<rss version="2.0">
  <channel>
    <link>https://www.w3.org/DesignIssues/</link>
    <title>Design Issues for the World Wide Web</title>
    <managingEditor>timbl@w3.org (Tim Berners-Lee)</managingEditor>

<description>These statements of architectural
  principle explain the thinking behind the specifications. These
  are personal notes by Tim Berners-Lee: they are not endorsed by
  W3C or anyone else. They are aimed at the technical community, to
  explain reasons, provide a framework to provide consistency for
  future developments, and avoid repetition of discussions once
  resolved.</description><lastBuildDate>Mon, 27 Jan 2025 21:41:04 GMT</lastBuildDate>

    <item>
      <pubDate>Tue, 23 Oct 2007 00:00:00 GMT</pubDate>
  <title>Levels of Abstraction: Net, Web, Graph</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Abstractions.html</link>
    <guid>https://www.w3.org/DesignIssues/Abstractions.html</guid>
      <description><![CDATA[
    <p class="conclusion">The web of things is built on the web of
    documents, which is built on the web of computers controlled by
    Domain Name owners, which itself is built on a set of
    interconnected cables. This is an architecture which provides a
    social backing to the names for things. It allows people to
    find out the social aspects of the things they are dealing
    with, such as provenance, trust, persistence, licensing and
    appropriate use as well as the raw data. It allows people to
    figure out what has gone wrong when things don't work, by
    making the responsibility clear.</p>
    <p>The value of this architecture is that each layer leverages
    the social components of the lower layer's architecture.</p>
  
<p><a href="https://www.w3.org/DesignIssues/Abstractions.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 15 Oct 2012 00:00:00 GMT</pubDate>
  <title>Working despite Ambiguity</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Ambiguity.html</link>
    <guid>https://www.w3.org/DesignIssues/Ambiguity.html</guid>
      <description><![CDATA[
    <p>(I guess this is one of these things which is perennial. I
    have not studied much of the history of philosophy but I do
    find one needs to be prepared to jump in in order to keep the
    course of what I otherwise regard as engineering still on
    track... as I have said before, this is <a href="PhilosophicalEngineering">philosophical engineering</a> we are
    doing...)</p>
    <p>The point which David Booth has brought up, not for the
    first time, and which Pat has expounded very well, that no
    symbol can ever have completely unambiguous meaning is, yes,
    quite valid. There are several such points which we have to go
    over every now and again (preferably out of the critical path
    of working group work) and agree we all understand it and agree
    that we can all continue in practice without it. And indeed
    continue in theory without it as well. And Pat, you have led
    us through that journey from philosophical foundationlessness
    to logical foundations before and maybe you can help us again
    or just point to where you did before. And Graham you make an
    important distinction.</p>
    <p>There are lots of models, I am sure, one can make of
    ambiguity and language and communication which will allow us to
    do this, and they may differ in how they work and it probably
    is best that we agree they exist but not get hung up arguing
    about which one is "right". They will all be imperfect, but
    good enough.</p>
  
<p><a href="https://www.w3.org/DesignIssues/Ambiguity.html">Read rest of article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 04 Sep 1998 00:00:00 GMT</pubDate>
  <title>Web Architecture from 50,000 feet</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Architecture.html</link>
    <guid>https://www.w3.org/DesignIssues/Architecture.html</guid>
      <description><![CDATA[
    <p>This document attempts to be a high-level view of the
    architecture of the World Wide Web. It is not a definitive
    complete explanation, but it tries to enumerate the
    architectural decisions which have been made, show how they are
    related, and give references to more detailed material for
    those interested. Necessarily, from 50,000 feet, large things
    seem to get a small mention. It is architecture, then, in the
    sense of how things hopefully will fit together. I have
    resisted the urge, and requests, to try to write an
    architecture for a long time: This was from a feeling that a
    dead and therefore less valuable document must result from any attempt to
    select which, of all the living ideas, seem most stable,
    logically connected and essential. So we should recognize that
    while it might be slowly changing, this is also a living
    document.</p>
    <p>The document is written for those who are technically aware
    or intend soon to be, so it is sparse on explanation and heavy in
    terms of terms.</p>
  
<p><a href="https://www.w3.org/DesignIssues/Architecture.html">Read rest of article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 19 Dec 1996 00:00:00 GMT</pubDate>
  <title>Axioms of Web architecture: URIs</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Axioms.html</link>
    <guid>https://www.w3.org/DesignIssues/Axioms.html</guid>
      <description><![CDATA[
    <p>The operation of the World Wide Web, and its
    interoperability between platforms of differing hardware and
    software manufacturers, depend on the specifications of
    protocols such as HTTP, data formats such as HTML, and other
    syntaxes such as the URL or, more generally, URI
    specifications. Behind these specifications lie some important
    rules of behavior which determine the foundation of the
    properties of the Web. These are rules and principles upon
    which new designs of programs and the behavior of people must
    rely. And it is that reliance which makes the Web both an
    information space which works now, and the foundation for
    future applications, protocols, and extensions. The more
    essential of these I refer to loosely as axioms, and the most
    basic of these have to do with URI.<br></p>
    <p>The aim of this article is to summarize in one place the
    axioms of Web architecture: those invariant aspects of Web
    design which are implied or stated in various specifications or
    in some cases simply part of the folk law of how the Web ought
    to be used. Especially for these latter cases, this article is
    designed to tie together the Web community in a common
    understanding of how we can progress, extend, and evolve the
    Web protocols. <i>Terms such as "axiom", and "theorem" are used
    with gay abandon rather than precision as this is not a
    mathematical treatise.</i><br></p>
  
<p><a href="https://www.w3.org/DesignIssues/Axioms.html">Read rest of article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 26 May 2010 00:00:00 GMT</pubDate>
  <title>Linked data is like a Bag of Chips</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/BagOfChips.html</link>
    <guid>https://www.w3.org/DesignIssues/BagOfChips.html</guid>
      <description><![CDATA[
    <p>The value of data is the insight which comes when different
    bits of data are joined together. For that process to provide
    value, the world must contain all sorts of kinds of information
    of different types, and it must be linked together. Linked data
    involves using ontologies. But if you are a developer, how do
    you pick those ontologies? The art is to use several different
    ontologies in the same document, the same message. In a typical
    application, part of what you need to express will be a
    very common idea (like, say, the title of a document), while part
    of the information will be concepts shared with particular
    groups or domains (like, say, blood pressure). And some will be
    obscure data (like, say, blood pressure monitor calibration
    data) which is only understood by device engineers. Putting all
    this information together in a mixture of ontologies is the
    best thing to do. Some you will find, some you may work with
    others toward consensus, some you might use that day in that
    project. Using each of those ontologies gets you the most total
    interoperability. A bag of chips has all kinds of information
    of different types, and each user (the customer, the checkout
    scanner, the celiac, the nutritionist) uses different bits and
    ignores the rest. With its mixture of ontologies and its rule
    of ignoring data you don't need or don't understand, the
    world of Linked Data is quite like a bag of chips.</p>
  
<p><a href="https://www.w3.org/DesignIssues/BagOfChips.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 13 Jan 2018 00:00:00 GMT</pubDate>
  <title>Beneficent Apps</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Beneficent.html</link>
    <guid>https://www.w3.org/DesignIssues/Beneficent.html</guid>
      <description><![CDATA[
  <div class="cols">
    <h1>Beneficent Apps</h1>
    <p>It is a sign of the times (the 2010s) that we even have to talk
    about these. Back in the days of floppy disk based PCs, when
    you would spend your money on a cool program (or App as we say
    now) you would spend money to be able to do useful, fun things.
    Play a game. Fly a plane. Write an essay. Do your taxes. You
    would typically have the program in the A: drive and put a disk
    for your data in the B: drive.</p>
    <p>The data on the B: drive was completely in your control. You
    could use it with different programs in the A: drive. The
    program in the A: drive was your tool. It helped you do your
    work. It worked for <i>you</i>.</p>
    <p>Until, that is, for me, one day when Quicken, the program I
    had bought ages ago to do my finances and taxes, having
    done my taxes, asked words to the effect of "Are you sure you
    have enough insurance? Would you like to buy some great
    insurance?"</p>
    <p>That was the end of an era: the era in which I trusted
    Quicken to be my representative and work on my behalf. Of
    course millennials are so used to doing everything on the web,
    using web-based tools, and so used to those web-based tools
    actually working in someone else's interest, that they may
    assume this is normal, and take this as the default.</p>
    <p>But in fact in this world we lose something very important.
    The basic human ability to use a computer provides a wonderful
    level of empowerment. There is something important about a
    program which represents me.</p>
    <p>While Beneficent Apps are not the norm on the web, or even
    on mobile devices, they are in fact common. <b>Every open source
    app is (should be!) a beneficent app</b>. These are apps which
    are developed by a community for its own use, and generally
    they are developed with the needs and wants of the end user in
    mind.</p>
    <p>Web browsers are important Beneficent Apps. They are crucial
    as the tool with which the person interacts with the web and all the
    crazy wonderful stuff out there. In the HTTP protocol spec, the
    browser is in fact called the User Agent. Web browsers must
    protect and serve the user in lots of ways:</p>
    <ul>
      <li>Help the user know and understand what party (website
      owner) they are actually talking to</li>
      <li>Help the user remember where they have gone</li>
      <li>Help the user curate a subset of the web which is
      valuable for them</li>
      <li>Store safely passwords and keys</li>
      <li>Help them avoid being tricked, fooled, or manipulated on
      line</li>
      <li>and so on</li>
    </ul>
    <h2>Tips</h2>
    <p>When designing a Beneficent App, always just think at each
    design choice -- what would the user want the app to do? If you
    are thinking of yourself as a main driving user, then think
    about use cases which empower, and connect to other powerful
    things you can do or will be able to do. But also think of
    users who differ from you in many ways - their level of
    tech ability, their preferences about being social or not,
    their situation, their personality.</p>
    <p>If you have an income stream from selling the data from your
    users, then you are not likely to build a Beneficent App by
    default.</p>
    <h2>Metrics</h2>
    <p>How do you measure how good a beneficent app is? It is so
    easy to make metrics for non-beneficent apps: the engagement
    level, click-through, the ad revenue or sales revenue they
    provide. It is more difficult to measure how useful you have
    been to your user. A user may just be really well informed of
    something really important, but not do anything which your app
    could pick up. So yes you can survey them, but now let's look
    at measuring their activity. If your app is something which
    helps people organize parties, then you can measure the number
    of parties which people organize. (not to mention whether they
    were great parties :-).. but those end goals come infrequently,
    so you could measure the amount of stuff going on to the end of
    the end goal: the amount of chat (how about sentiment analysis
    of the chat as to whether it is happy, constructive?) the
    extent to which the to-do system works - do tasks get done by
    different people to the people who raise them, for example --
    one can think of lots of potential things you could measure which
    may or may not be useful. Then when you have that list, you can
    look at previous parties and see how they were involved with
    making great parties, or with actually getting the party
    organized at all (which of those would you want to optimize
    for?) You can also try your apps out, doing A/B testing in the
    next version as things to optimize for. But also beware of
    unexpected negative effects. Did the people building social
    networks imagine the effects on teenage health of a beauty
    based economy? Probably not, but now we know that systems built
    to be happy centers of collaboration can end up being toxic for
    classes of users. To be beneficent, you have to also do no
    harm!</p>
    <h2>Regulating Agents</h2>
    <p>The concept of apps which work for you, the concept of
    something which is your agent, is not in fact foreign to our
    current world. We have it, after all with doctors, and with
    lawyers. A doctor takes the Hippocratic Oath, or some form of
    it, in which they commit to operate in the interests of the
    patient. Lawyers also are bound to put the interest of their
    client first.</p>
    <p>There are therefore lots of laws and regulations out
    there to take inspiration from when wording commitments for
    Apps, or the developers who create them, if anyone wanted to
    craft regulations or terms and conditions about Beneficent
    Apps.</p>
    <h2>Beneficent AI</h2>As AI gets more powerful, with every step it
    takes it becomes more important that it is beneficent.
    Beneficent for you the individual, and for us the human race.
    More important that you have AIs which work for you not someone
    or something else.
    <p>Hey online Ad system, who do you work for? When you
    recommend I eat at a restaurant, is that the best one for me,
    or the result of an instant online auction for whoever bids
    most to you for my custom? Hey, Siri, who do you work for? Hey,
    Alexa, who do you work for?</p>
    <p>Can you even imagine an AI that works for <em>you</em>? I
    can, and he's called <a href="Charlie.html">Charlie</a>.</p>
    <hr>
    <h4>References</h4>
    <ul>
      <li>IETF <a href="https://www.rfc-editor.org/rfc/rfc7230">Hypertext Transfer
      Protocol (HTTP/1.1): Message Syntax and Routing</a>
      </li>
    </ul>
  </div>
  <div class="nav">
    <a href="Overview.html">Up to Design Issues</a>
    <p><a href="Charlie.html">on: Charlie is a Beneficent
    AI</a></p>
    <p><a href="../People/Berners-Lee">Tim BL</a></p>
  </div>


]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 31 Oct 2017 00:00:00 GMT</pubDate>
  <title>Blockchain</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Blockchain.html</link>
    <guid>https://www.w3.org/DesignIssues/Blockchain.html</guid>
      <description><![CDATA[
  <h1>Blockchain and the Web</h1>
  <div class="cols">
    <p>There are so many frequently asked questions around
    Blockchain and the web, so this may be a place to tease a few
    things out, specifically about the relationship between the
    technologies. What technical problem is there in the world in
    2017 where someone has not asked "Will the blockchain solve
    that?". The response is then typically "Well, what do you mean
    by <i>blockchain</i> exactly?", as it really depends which
    aspect of which system you are talking about. Are you talking
    about the original Bitcoin Blockchain, or some other use of
    blockchain technology, like Ethereum, or something which has a
    crypto-currency but quite a different protocol, like Ripple,
    and so on. And what were you using the system for - to transfer
    money, to claim a unique identity, or to notarize a document?
    It all depends. Now there are a huge number of things written
    on blockchain, so here I make no attempt to explain it or
    discuss its details, about which I am not an expert. Here I
    just try to fish out some of the distinctions between different
    things out there and how they relate to Web Architecture.
    Within that, many questions arise, some answered here, and some
    unanswered anywhere.</p>
    <h2>The Bitcoin Blockchain</h2>
    <p>When people talk about Blockchain, sometimes they mean the
    original Blockchain, in which Bitcoins are mined, and sometimes
    they mean anything which uses the same ideas but is
    different. These are quite different things. The original Bitcoin
    blockchain was the first the world saw of the technology and it
    proudly announced a new system which included:</p>
    <ul>
      <li>A shared global ledger on which things could be
      notarized</li>
      <li>A currency which could be traded rapidly and cheaply on
      the net</li>
      <li>Protocols to use the ledger as a first-come first-served
      name-allocation system</li>
      <li>Protocols to declare your public key in the web.</li>
      <li>Many other proposed exciting ideas</li>
    </ul>There were mutual dependencies:
    <ol>
      <li>The ledger depends on people mining Bitcoins to guarantee
      its integrity</li>
      <li>Bitcoins depend on the ledger for being bought (split)
      and sold</li>
    </ol>There were a few anomalies, or ironic aspects. The ledger
    system works and maintains its integrity only because of the
    complete openness of the blocks in the chain, and the fact that
    many systems were constantly checking it; and yet bitcoin in
    fact provided a way that value could be used anonymously for
    criminal purposes with impunity. Material in or linked from
    blocks can of course be encrypted, and must be if not
    public. In general the practice of encrypting stuff and making
    the encrypted version public requires faith that the encryption
    method won't be cracked in any timescale which is important to
    the parties. The cracking of encryption becomes ever easier
    with more powerful computers, latent vulnerabilities in the
    system, and new hacks to it.<br>
    <br>
    Also, for a new fast and light system of transfer of value, it
    was ironic that the mining needed to both generate new coin and
    maintain the verified integrity of the chain necessarily by
    design involves burning huge amounts of energy, which is not
    ecologically sustainable.
    <p>Relating these protocols to the web, they connect in various
    ways:</p>
    <ul>
      <li>
        <a href="https://www.w3.org/TR/webauthn-2/">Web
        authentication</a> can use identities based on the
        blockchain. You could log onto a web server using a key
        pair whose public key is declared in the blockchain.
        Identity systems, some using "DID" standard URNs, like
        Sovrin, Veres One allow one to create an identity which
      </li>
      <li>
        <a href="https://www.w3.org/Payments/">Web Payments</a>
        defines a modular system within the browser which allows
        new forms of payment system to be inserted. Payment with
        Bitcoin, or any other crypto token, could be added, and I
        expect several will be added, perhaps in the basic browser
        itself, or using a browser extension.
      </li>
      <li>Any web site version could be notarized in a blockchain.
      Many websites are already managed using a version control
      system like Mercurial or git, which already keeps hashes of
      the entire tree of files. One could also create one's own hash
      of the state of a web site, or a sub-tree of the file system
      behind it. Then, for example to claim (say) copyright at a
      certain time of a certain content, you could stick the hash
      into the blockchain.</li>
      <li>and so on.</li>
    </ul>
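    <p>As a minimal sketch of the notarization idea in the list above,
    assuming SHA-256 and a plain directory of files, one could compute a
    single hash for the state of a site as follows. The function name
    <code>tree_hash</code> and the way paths and contents are combined
    are illustrative assumptions, not a standard; version control
    systems such as git use their own object formats. Putting the
    resulting digest into a blockchain is a separate step not shown
    here.</p>

```python
import hashlib
import os


def tree_hash(root):
    """Return a SHA-256 hex digest covering every file under root.

    Hypothetical helper for illustration only: the combination rule
    (relative path, NUL byte, content digest, newline) is an assumption.
    """
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit sub-directories in a deterministic order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            with open(path, "rb") as handle:
                file_digest = hashlib.sha256(handle.read()).hexdigest()
            # Mix in the relative path as well as the content digest, so
            # that renaming a file changes the result, as editing it does.
            digest.update(rel.encode("utf-8") + b"\0")
            digest.update(file_digest.encode("ascii") + b"\n")
    return digest.hexdigest()
```

    <p>Calling <code>tree_hash</code> twice on an unchanged directory
    gives the same digest, and any edit to any file under it gives a
    different one, which is the property a notarized hash relies on.</p>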
    <h3>The value of bitcoin</h3>
    <figure style="min-width: 15em; width:33%; border: 0.1em solid green; float: left;">
      <img title="Bitcoin vs USD [via xe.com]" style="margin: 1em; width:90%;" alt="Rising 500 to 7000 increasingly sharply over the years" src="diagrams/blockchain/2017-11-02-year-xbt-usd-xe.png">
      <figcaption style="margin: 0.5em;">
        The rise of the value of a Bitcoin against the US Dollar
        over the last few years to 2017-10-31
      </figcaption>
    </figure>
    <p>To first order, for lots of operations transferring money
    between different currencies and countries, the value (say in
    USD) of a bitcoin is irrelevant. For those making a remittance,
    say of USD from the USA to (say) Haiti, all one needs to do is
    own some Bitcoin for a minute between buying them in the US and
    selling them in Haiti.</p>
    <figure style="min-width: 15em; width:33%; border: 0.1em solid green; float:right;">
      <img alt="dot com boom: also rising 500 to 7000 increasingly sharply over the years" style="margin: 0.5em; width:90%;" src="diagrams/blockchain/content_Fidelity-nasdaq-chart-_1999.png">
      <figcaption style="margin: 0.5em;">
        The NASDAQ up to 1999. People bought tech stocks because of
        the prices they imagine other people will be prepared to
        pay in future, with little justification from actual
        revenue of the companies.
      </figcaption>
    </figure>So does the current [2017] dramatic rise in the value
    of Bitcoin have no connection with the use of the Bitcoin
    Blockchain as a system? On the contrary, there is a strong
    connection. The promise of rise in value of Bitcoin motivates
    the miners, who keep the blockchain operating and ensure its
    integrity. In general you can imagine that people buy bitcoin
    for at least two reasons<br>
    <ul>
      <li>In order to make a transfer like the remittance
      above</li>
      <li>In order to store value, and invest in the possible rise
      in value of the token.</li>
    </ul>
    <p>The second is more effective at raising the cost of the
    currency, as the first is a temporary need. Investing in
    bitcoin is, if you like, just to add risk to your life. There
    is no logic based on future revenue to attribute value to it.
    Its market value is only the extent to which other people
    imagine other people imagine other people will value it in
    future. It is completely speculative.</p>
    <p>Systems which rely on the blockchain run the risk of
    breaking if bitcoin mining stops. The design of Bitcoin ensures
    that to make each new coin, more energy is needed than to make
    the last one. So the cost of each coin in energy, and therefore
    in USD, constantly rises. This, I imagine, is why people behave
    as though the <i>value</i> of bitcoins will constantly rise.
    However there is nothing in the math that guarantees that there will
    be anyone willing to pay that much in, say, US Dollars for it. If
    miners find they don't make a profit, then they will stop
    mining. When they stop mining, then presumably the system
    stops, and no one can use the bitcoin blockchain to trade
    bitcoin.</p>
    <h3>Differences between Bitcoin Blockchain and the Web</h3>
    <p>Blockchain and the web are similar only in that you can store
    stuff in it and later retrieve it. The web and blockchain are
    very different.</p>
    <p>In a blockchain system, everybody stores everything. It is a
    distributed ledger where every node in the system stores a copy
    of the whole chain. The one blockchain is stored by the whole
    set of servers. Each server is responsible to the same extent
    to make sure the stuff in the blockchain is stored and
    available.</p>
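    <p>To make concrete why a node which stores the whole chain can
    check it for itself, here is a toy sketch of a hash-linked ledger.
    It is an illustration under stated assumptions, not the Bitcoin
    block format: the block fields, the JSON encoding, and the function
    names are all hypothetical, and there is no mining or consensus
    step.</p>

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous block"


def block_hash(block):
    """Hash a block's contents using a stable (sorted-key) JSON encoding."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain, data):
    """Add a block that records the hash of the block before it."""
    previous = block_hash(chain[-1]) if chain else GENESIS
    chain.append({"index": len(chain), "previous": previous, "data": data})


def chain_is_valid(chain):
    """Check that every block points at the true hash of its predecessor."""
    expected = GENESIS
    for block in chain:
        if block["previous"] != expected:
            return False
        expected = block_hash(block)
    return True
```

    <p>Because each block commits to the hash of the one before it,
    changing any block that has a successor makes
    <code>chain_is_valid</code> fail on a node's own copy. Detecting a
    change to the very last block needs something more, such as
    comparing the latest hash with other nodes.</p>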
    <p>In the web, each web site is different. Hardly anybody
    stores the same thing. Each web site has different authors,
    different fans, different plans. Different requirements for
    size, shape and speed of the data.</p>
    <p>When you store something on the web, there are three things
    you need to keep working in order for it to be available in the
    future:</p>
    <ol>
      <li>The Internet infrastructure</li>
      <li>The Domain Name System</li>
      <li>The web server which you put the stuff on</li>
    </ol>
    <p>When you store something in the block chain, you need three
    things:</p>
    <ol>
      <li>The Internet infrastructure</li>
      <li>The community of people and organizations which together
      store the blockchain</li>
      <li>The economic and market conditions around that community
      to motivate the maintenance of the chain</li>
    </ol>
    <p>The persistence on the web server depends on the effort you
    put in to set it up and have it well hosted, and the money you
    or whoever it is spends to maintain the server and pay for its
    connection to the Internet. There are important organizations
    like the Internet Archive which keep copies of things, but
    modulo those, the continued operation of the site either
    depends on an entity, like forbes.com which sources the
    material to maintain its own site, or a company such as
    facebook, twitter, or github which serve the data for their
    users, for the members of their clubs. This doesn't mean that
    these will magically be there for ever, as those who plowed their
    creativity into their <em>AOL Hometown</em> web sites found
    when AOL turned off hometown.</p>
    <p>For distributed systems like the blockchain, the
    responsibility for the maintenance of the data you put in them
    is shared by a single community. When we are talking about the
    Bitcoin Blockchain, then it is specifically the Bitcoin
    community. The companies which mine for bitcoin are crucial for
    the integrity of the system, as without them, there would be no
    new blocks on which to be able to put more information, and
    there would be no one checking the integrity of the data, old
    and new.</p>
    <p>If your vision is that everyone will use the same single
    blockchain, then you are asking them to accept the same
    "Quality of Service" properties: the same reliability, the same
    time it takes, the same cost. It is like requiring everyone to
    join the same one club with the same facilities, opening hours,
    and all pay the same fee. People are different and in fact want
    to join different sorts of clubs with very different sorts of
    facilities and very different fees. If the same Bitcoin
    blockchain is used by gamers to exchange moves in a distributed
    game, and retailers for consumer spending, and banks to record
    the transfer of ownership of real estate, then at some point
    aren't the banks going to object to maintaining an
    infrastructure used mostly by gamers, and the gamers object to
    paying transaction fees used by the banks? Won't the banks spin
    up a new system (like <a href="#Ripple">Ripple</a>, say) where
    anyone can join so long as you're a bank? Won't the gamers spin
    up their own Ethereum chains because they can and they are cheaper
    and they don't need persistence?</p>
    <p>Looked at in that light, in terms of social space of people
    who use it, and the economic space of its service parameters,
    the <b>Bitcoin blockchain is centralized</b>. It's not like the
    web, where everyone can make their own web site, in an
    independent way, and make it as big or as small, and as fast or
    as slow as they like.</p>
    <h3>Instability of Currency</h3>
    <p>When people invest in a blockchain-based currency, in order
    to benefit from its later rise in value, they are taking a risk
    that the currency will drop. A bit like investing in "Dot-Com"
    startups during the boom, they are giving a currency a value
    based purely on the imagination that others will in future value
    it at a given level -- not based on a revenue or interest which
    the system will provide. Risks include</p>
    <ul>
      <li>The value crashing as dot com values did</li>
      <li>Those who promote the currency being accused of running a
      form of Ponzi scheme where you rely on future joiners to make
      it worthwhile for those joining now, in an unsustainable
      way</li>
    </ul>So if you are looking at a blockchain as a place to store
    your data, be aware that you are connecting it to a financial
    system, whose continued functioning will be required to keep
    the data accessible.
    <hr>
    <h2>Other blockchains in general</h2>
    <p>When you actually look at building an application to use the
    blockchain as the place where it stores its data, then there are a
    few serious issues. Moxie Marlinspike's <a href="#mm">blog</a>
    is one of several blogs on the subject. Three of the issues are
    privacy, speed and transaction cost, and re-centralization.</p>
    <h4>Blockchains are Public</h4>
    <p>When you put something on a blockchain, then the way that
    the blockchain works is that a copy of it is held by every node
    in the system. So it is very public. If you are using it for
    claiming a public global digital identity, then that may be
    what you want. But if you want to use it for something private,
    like personal data, then this is definitely <b>not</b> what you
    want.</p>
    <p>Yes, you can encrypt it. But encrypting your data for
    security and then putting it somewhere very public has two
    problems. One is that even if people can't decrypt they see
    that you have put something there, which may already be
    revealing. The other problem is that typically encryption gets
    easier to crack over time, with faster computers, (not to
    mention quantum) and sometimes discoveries of weaknesses in the
    algorithms.</p>So all the stuff on the blockchain can be held
    encrypted by people just waiting for a time when they are able
    to decrypt it.
    <h4>Blockchains are Slow</h4>If you put stuff on the
    blockchain, it takes a while. You have to come to an agreement
    with everyone using the chain what the next block will be.
    <h4>Blockchains are Expensive</h4>Blockchains workin different
    ways, but a common theme is 'gas fees'. The tokens you have to
    spend to
    <p>Of other blockchains, there are those which pretty much use
    the same protocol as Bitcoin, but are a distinct chain, and
    those which use related sorts of algorithm.</p>
    <p>One thing which distinguishes them is whether the crypto
    token value is tied to a service of value, like computation or
    storage, such that the protocol automatically guarantees that
    the service is delivered in return for the coin, and so linking
    the value of the coin to the costs of providing and value of
    the service.</p>
    <h4>Re-centralization</h4>
    <p>If the actual way a practical blockchain app gets to put
    something in the chain is through an online service -- the
    blockchain code doesn't run on the user's computer, but on one
    of a small number of portals -- then the system isn't really
    decentralized, in effect. The monopoly portals are back in
    control.</p>
    <h3>Filecoin</h3>
    <p>Filecoin is a cryptocurrency, from the designers of IPFS, in
    which the currency value is related to the amount users are
    willing to pay for, and providers are willing to provide, two
    services. One is the storage of information, and the other is
    the retrieval of information. So you can go to a storage
    provider with your encrypted family photos, you specify a
    storage time, pay some Filecoin and then the protocol provides,
    as a property of the protocol, that the storage provider will
    store the data for the time given. Well, it provides that in
    order to remain a player on the system, it must.</p>
    <p>You might ask, what happens if you can afford to store your
    stuff, but later on the market changes and you, or your readers,
    can't afford to access it? When you buy into the system, you
    are not guaranteeing that the Filecoin world will exist, but
    that if it does, your data will be stored.</p>
    <h2>Decentralized non-blockchain protocols</h2>
    <p>These are only summarized rather than elaborated in depth.
    If you are thinking of storing things on a blockchain, then
    is one of these in fact what you need?</p>
    <h4>Distributed Hash Tables (DHT)</h4>
    <p>Collaborating parties make their data into chunks each of
    which is hashed and then stored at a server chosen by indexing
    into the list of servers with the hash.</p>
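    <p>The placement rule just described can be sketched in a few lines of
    Python. This is only an illustration: the server names, the choice of
    SHA-256 and the simple modulo indexing are all assumptions, and real DHTs
    use more elaborate schemes so that servers can join and leave.</p>

```python
import hashlib

# Hypothetical list of collaborating servers (names are illustrative only).
SERVERS = ["node-a.example", "node-b.example", "node-c.example"]


def chunk_key(chunk: bytes) -> str:
    """Return the hex SHA-256 hash that names a chunk."""
    return hashlib.sha256(chunk).hexdigest()


def server_for(key: str, servers=SERVERS) -> str:
    """Pick a server by indexing into the server list with the hash."""
    return servers[int(key, 16) % len(servers)]


def store(data: bytes, chunk_size: int = 4):
    """Split data into chunks and report where each chunk would be stored."""
    placements = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        key = chunk_key(chunk)
        placements.append((key, server_for(key)))
    return placements
```

    <p>Anyone who knows the hash of a chunk can work out which server to ask
    for it, without consulting a central index.</p>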
    <p>The "InterPlanetary File System" [sic] (IPFS) is a project
    which allows a community to share immutable files indexed by
    their hashes.</p>
    <p>IPFS can be used with a URL scheme, ipfs:, which a small
    browser extension enables.</p>There is also an IPFS HTTP gateway.<br>
    <h2>Using the web as a Ledger</h2><br>
    Now let us look at doing, on the web, some of the things which
    people do on a blockchain.<br>
    <h3>Using an arbitrary web page as the head of a ledger
    list</h3>
    <p>If you are prepared to trust a particular social entity with
    the head of your ledger, you can of course put it on their web
    site. It may not be as "decentralized" as putting it on the
    bitcoin blockchain, but it will delegate the job of keeping the
    list to a known party. So one way of working is to use a
    trusted web site as your ledger. Then you can base all the
    typical blockchain operations on it, such as notarizing
    transfers of ownership or money, staking first claims to unique
    names, and so on.</p>
    <h3>Digitally signed linked data</h3>
    <p>It is straightforward to digitally sign data. You can sign a
    serialized document and convey or publish that serialization,
    or you can canonicalize it and sign the canonicalized data
    model, canonicalized RDF (or XML or JSON). You can then chain
    together a series of signed documents, each with a URI of and
    also typically a hash of the ones to which it refers and which
    it depends on. The web of linked data is particularly suitable
    for this of course, and a read-write store of data allows
    applications which operate by making chains (or in general
    directed acyclic graphs) of digitally signed assertions to
    flourish. (See the <a href="PaperTrail.html">PaperTrail</a>
    architecture in these notes)</p>
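    <p>A minimal sketch of such a chain, in Python, might look like the
    following. Several things here are assumptions made for illustration:
    documents are JSON rather than RDF, canonicalization is just sorted-key
    JSON, and the "signature" is an HMAC with a shared secret, used only
    because it is in the standard library. A real system would use a
    public-key signature so that anyone can verify it.</p>

```python
import hashlib
import hmac
import json

# Stand-in for a real signing key (assumption: HMAC replaces a public-key
# signature here purely to keep the sketch within the standard library).
SECRET = b"demo-key"


def canonical(doc: dict) -> bytes:
    """Serialize deterministically so equal data always gives equal bytes."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(doc: dict, key: bytes = SECRET) -> str:
    """Return a hex signature over the canonical form of the document."""
    return hmac.new(key, canonical(doc), hashlib.sha256).hexdigest()


def make_entry(uri: str, body: dict, previous=None) -> dict:
    """Build a signed entry that links to the previous one by URI and hash."""
    doc = {"uri": uri, "body": body, "prev": None}
    if previous is not None:
        prev_hash = hashlib.sha256(canonical(previous)).hexdigest()
        doc["prev"] = {"uri": previous["doc"]["uri"], "hash": prev_hash}
    return {"doc": doc, "signature": sign(doc)}


def verify_chain(entries, key: bytes = SECRET) -> bool:
    """Check every signature and every link back to the preceding entry."""
    for i, entry in enumerate(entries):
        if not hmac.compare_digest(entry["signature"], sign(entry["doc"], key)):
            return False
        if i > 0:
            expected = hashlib.sha256(canonical(entries[i - 1])).hexdigest()
            prev = entry["doc"]["prev"]
            if prev is None or prev["hash"] != expected:
                return False
    return True
```

    <p>Because each entry carries the hash of the one before it, altering an
    earlier document invalidates every later link, wherever on the web the
    documents happen to be stored.</p>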
    <h3>Trees of hash-addressed data in the web</h3>
    <p>Many of these systems refer to immutable data by its hash.
    IPFS, for example, and the immutable part of MaidSAFE. But of
    course hashed trees of immutable data are no stranger to the
    web, and much of it is underpinned by git and mercurial
    repositories. Here a hash is used to refer, securely, to a
    given specific version of a repository. So the web is full of
    <a href="https://en.wikipedia.org/wiki/Merkle_tree">Merkle
    trees</a>, which have similar properties to IPFS. An
    interesting possibility is to extend HTTP to surface the Merkle
    tree, so that versions, or immutable parts, of the existing web
    can be referred to in a secure way. This would allow a client,
    for example, to check out a version of a subtree of a web site,
    and load it into, or request it from, IPFS. This connects to
    the Memento framework for tracking the history of web sites.
    Basically, any tree of data on the web which is immutable can
    be secured and referred to by a hash, and this includes data,
    like the messages from past chats in a Solid Pod, which was
    once mutable but is then declared immutable.</p>
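    <p>To make the idea concrete, here is a small Python sketch of hashing a
    tree of data so that one hash secures the whole subtree. It is git-like
    in spirit only: the "blob" and "tree" labels, the line format and the use
    of SHA-256 are assumptions for illustration, not git's or IPFS's actual
    object formats.</p>

```python
import hashlib


def blob_hash(content: bytes) -> str:
    """Hash of a leaf, that is, the bytes of one resource."""
    return hashlib.sha256(b"blob:" + content).hexdigest()


def tree_hash(tree: dict) -> str:
    """Hash of a directory: it depends on each child's name and hash."""
    lines = []
    for name in sorted(tree):  # sort so the result ignores insertion order
        child = tree[name]
        if isinstance(child, dict):
            lines.append(f"tree {tree_hash(child)} {name}")
        else:
            lines.append(f"blob {blob_hash(child)} {name}")
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()
```

    <p>For example, <tt>tree_hash({"index.html": b"home", "chat": {"2020.ttl":
    b"hello"}})</tt> gives a single identifier for that version of the
    subtree; changing any byte of any file underneath changes it.</p>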
    <h2>Using existing web architecture in a less centralized
    way</h2>
    <p>If you want to use the web to store stuff, then the weak
    point, the main centralization, is the fact that you have to
    get a domain name. If you don't get your own domain name (like
    <tt>alice.com</tt>), then you end up with your data stored at a
    URL which includes the domain name of your ISP (like
    <tt>alice.myisp.com</tt>). The latter means you are bound to
    using the same ISP forever (unless they arrange forwarding to a
    new provider). So all efforts to make it easier for people to
    get a domain name and then establish a web presence there, like
    a Solid Pod, are useful. As are top level domains which respect
    their users.</p>
    <h2>Conclusion</h2>
    <p>Blockchain and the crypto currency protocols solve some
    interesting and useful problems, but none of them is the
    universal panacea which some have been looking for to fix our
    dependency on huge monopoly platforms. It is possible to switch
    certain functions, like website domain names, or personal
    identities, to blockchain-based protocols, but when that is
    done, the world has to be aware of a new dependency on the
    community which runs that system, as we did with DHTs. Economic
    models for the support of the system need to be elaborated, be
    transparent and well understood. But in general use as a
    place for Web apps to store data, blockchains are too slow,
    too expensive, and too public.</p>
    <hr>
    <br>
    <h3>References</h3>
    <ul>
      <li>
        <a id="mm">Marlinspike, Moxie</a> <a href="https://moxie.org/2022/01/07/web3-first-impressions.html">My
        first impressions of web3</a>. Essential reading
      </li>
      <li>
        <a href="https://www.xe.com/currencycharts/?from=XBT&amp;to=USD&amp;view=1Y">
        Currency charts at XE.com</a>
      </li>
      <li>
        <a id="Ripple" href="https://ripple.com/">Ripple.com</a>
        home page.
      </li>
      <li>
        <a href="https://filecoin.io/">Filecoin</a> home page
      </li>
      <li>
        <a href="https://ipfs.io/">IPFS</a> home page
      </li>
      <li>
        <a href="https://en.wikipedia.org/wiki/Distributed_hash_table">Distributed
        hash table (DHT)</a> in Wikipedia
      </li>
      <li>
        <a href="https://www.w3.org/blog/2016/08/memento-at-the-w3c/">Memento
        at W3C</a>
      </li>
      <li>
        <a href="https://tools.ietf.org/html/rfc7089">RFC7089</a>
        HTTP Framework for Time-Based Access to Resource States --
        Memento
      </li>
      <li>
        <a href="https://maidsafe.net/">The World's First
        Autonomous Data Network: The SAFE Network</a>
      </li>
      <li>
        <a href="https://www.youtube.com/watch?v=DaxU0ut5tUw">I
        want an iPhone 4</a>, YouTube
      </li>
      <li>Fidelity Australia, "<a href="https://www.livewiremarkets.com/wires/the-nasdaq-will-history-repeat-or-will-it-rhyme">The
        Nasdaq: Will history repeat or will it rhyme?</a>"
      </li>
    </ul>
    <h3>Updates</h3>
    <ul>
      <li>CryptoSlate, 2022-12-12 <a href="https://cryptoslate.com/btc-is-now-cheaper-than-the-all-in-sustaining-cost-of-mining-btc">
        BTC is now cheaper than the all-in-sustaining cost of
        mining BTC</a>
      </li>
    </ul>
    <div class="nav">
    <hr>
    <p><a href="Overview.html">Up to Design Issues</a></p>
    <p><a href="../People/Berners-Lee">Tim BL</a></p>
  </div>
  </div>


]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 01 Jan 2001 00:00:00 GMT</pubDate>
  <title>Conceptual Graphs and the semantic Web</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/CG.html</link>
    <guid>https://www.w3.org/DesignIssues/CG.html</guid>
      <description><![CDATA[
  <h1>Conceptual Graphs and the Semantic Web</h1>
  <p>To put it in a nutshell, Conceptual Graphs (CGs) are a logic
  language used for describing closed worlds of logic. They have
  traditionally had a strong emphasis on two-dimensional graphical
  representations, but there are conventional serializations, one
  "Linear Form" (LF) quite comparable with <a href="Notation3.html">N3</a>, and one CG Interchange Format (CGIF)
  which is more official. With various pros and cons, they are
  basically as expressive as KIF -- and so in a way only have to be
  webized to be a basis for the Semantic Web.</p>
  <p>Here I go over a few differences and similarities between CGs
  and Semantic Web work based on RDF.</p>
  <p>I will ignore completely "nonsemantic information" ([1], sec2
  ) in this short comparison.</p>
  <h2 id="Webizing">Webizing CGs</h2>
  <p>Let's take the principles of <a href="Webize">webizing a
  language</a> and look at how that applies to CGIF or LF, to
  imagine a semantic web based on CGIF.</p>
  <p>The first thing we clearly have to do is modify the CG
  syntaxes so that each concept and each relation can be a first
  class object, by having a URI. The syntax modification is just to
  allow the characters in a URI to be included, so that an
  arbitrary concept can be referenced, or an arbitrary relation
  used. A typical way to map URI space to CG identifiers would be
  to make the URI of a CGIF identifier a concatenation of the URI
  of the CGIF document, a hash sign, and the local CG identifier --
  making the existing local identifier a fragment identifier in URI
  terms.</p>
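  <p>As a rough illustration of that mapping, the following Python sketch
  builds the URI for a local CG identifier and splits it back again. The
  function names and the example document URI are hypothetical, and the
  sketch assumes the local identifier contains only characters that are
  legal in a URI fragment.</p>

```python
from urllib.parse import urldefrag


def cg_identifier_to_uri(document_uri: str, local_id: str) -> str:
    """Concatenate the document URI, a hash sign and the local identifier."""
    base, _fragment = urldefrag(document_uri)  # drop any existing fragment
    return f"{base}#{local_id}"


def uri_to_cg_identifier(uri: str):
    """Split a URI back into (document URI, local identifier)."""
    base, fragment = urldefrag(uri)
    return base, fragment
```

  <p>So a concept written locally as <tt>Boston</tt> in a document at
  <tt>http://example.org/kb.cgif</tt> would be referred to from elsewhere as
  <tt>http://example.org/kb.cgif#Boston</tt>.</p>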
  <p>Having mentally webized the language, then the question is how
  such a semantic web language maps onto other languages. This is
  simplified by the fact that the CG spec [1] gives a mapping to
  KIF.</p>
  <h2 id="Types">Types and Classes</h2>
  <p>CG and RDF share the concept of type. CGs have the restriction
  that the worlds of concepts and types, and that of
  relationships and relationship types, are disjoint. Therefore,
  you cannot use a CG to express something about a relation using a
  relation. If one wanted a true bidirectional mapping, then CGs
  would have (it seems at first reading) to more or less reify --
  to describe at a meta level - an arbitrary RDF graph. However,
  this would not in my opinion be useful. The designers of CGs
  intended this disjunction, and so the natural mapping is directly
  from CG concept types to RDF Classes, and from CG relations to
  Properties, and from CG Relation Types to RDF Classes which are
  subclasses of rdf:Property.</p>
  <p>The semantic web logic language has to be universal in that it
  must allow expression of any other language; but it certainly
  does not force every language to be universal itself.</p>
  <h2 id="Centralize">Centralized Notions in CGs</h2>
  <p>The CG concept of a knowledge base (KB) contains a few
  centralized ideas. These are not in fact architectural problems
  with CGs - they are just engineering decisions which were made
  without the web scaling requirement. Removing them does no damage
  to the CG idea at all.</p>
  <ul>
    <li>The ideal of a closed knowledge base, especially that there
    is a single catalog of all individuals. A KB contains a
    hierarchy of types, a hierarchy of relations, and a central
    catalog of individuals. The hierarchies are not trees, but
    acyclic graphs, so they do not pose a problem above the fact
    that they are closed - A KB must</li>
    <li>The fact that a concept is associated with a single type.
    In the semantic web, though the original creator of a Thing may
    define a type, logically statements made by third parties can
    equally well make type assertions about a thing, and those
    statements may be in the form of an rdf:type statement.</li>
    <li>A coreference set has to have a single dominant
    concept.</li>
  </ul>
  <h2 id="Difference">Properties and relations</h2>
  <p>The main difference which stands out at first reading is that
  RDF properties are always dyadic, while CG relations can be
  N-adic.</p>
  <p>The RDF base model, and the N3 method of extending it to a
  logical framework, seem to be supported as a base structure,
  although the lack of N-ary forms shows up as a mismatch, but the
  existence of arcs explicitly in the CG model of an N-adic
  relation suggests a natural mapping back into dyadic RDF when
  n&gt;2. This just leaves a little tension as the two forms
  coexist.</p>
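  <p>One way to picture that mapping is the Python sketch below, which
  turns an N-adic relation into a set of dyadic triples, one per arc. The
  node label and the <tt>cg:arc1</tt>, <tt>cg:arc2</tt>, ... property names
  are invented for illustration and are not taken from the CG or RDF
  specifications.</p>

```python
def relation_to_triples(relation_type: str, arguments, node: str = "_:r1"):
    """Map an N-adic relation to dyadic triples, one per arc.

    A node stands for the relation instance; each numbered arc becomes a
    property linking that node to one argument of the relation.
    """
    triples = [(node, "rdf:type", relation_type)]
    for position, argument in enumerate(arguments, start=1):
        triples.append((node, f"cg:arc{position}", argument))
    return triples
```

  <p>A triadic relation such as "between" then becomes one type statement
  plus three arc statements, all sharing the same subject.</p>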
  <p>The CG world is a bipartite graph - one composed of two kinds
  of node, relations and concepts, which are disjoint. The RDF
  world, while it does consist of links which can be thought of as
  going from a thing, via a property, to a thing, does not make
  properties and things disjoint. Everything is a Thing.</p>
  <h2 id="Similartie">Striking similarities</h2>
  <p>Some similarities of the CG work and the semantic web to date
  are striking. Both are inspired largely by circles and arrows
  diagrams, and in LF and N3 this even shows through in some
  syntactic forms. People have through the ages been writing
  circles and arrows on whatever material they had to hand
  [Enquire, cavewriting] and in N3 I tried to take this very simply
  into unicode with</p>
  <pre>w3c:Michael  &gt;- org:member -&gt; w3c:team .
</pre>
  <p>There was a certain feeling of recognition on seeing John
  Sowa's</p>
  <pre>[Go]-
   (Agnt)-&gt;[Person: John]
   (Dest)-&gt;[City: Boston]
   (Inst)-&gt;[Bus].
</pre>
  <p>which in N3 would be</p>
  <pre>@prefix : &lt;#&gt;.
[a :Go]
   &gt;- :agent -&gt; [a :Person; = &lt;#John&gt;];
   &gt;- :dest -&gt; [ a :City; = &lt;#Boston&gt;];
   &gt;- :inst -&gt; [ a :Bus].
</pre>
  <p>remarkably similar, down to the final period. Both syntaxes also
  have backward arrows: a &lt;- (p) &lt;- b in CG's LF, and a&lt;-p-&lt;b
  in N3. (See also: <a href="../2000/10/swap/test/cg/bus.rdf">the
  same in RDF</a>)</p>
  <h2 id="Context">Contexts</h2>
  <p>The concept of "context" occurs very equivalently in CGs and
  N3, where in both cases a formula is built using quotation. In
  N3, the braces were introduced to encapsulate a set of
  information and talk about it as a set. Using an example from
  [1], loosely "Tom believes that Mary wants to marry a
  sailor":</p>
  <pre>[Person: Tom]&lt;-(Expr)&lt;-[Believe]-&gt;(Thme)-
   [Proposition:  [Person: Mary *x]&lt;-(Expr)&lt;-[Want]-&gt;(Thme)-
      [Situation:  [?x]&lt;-(Agnt)&lt;-[Marry]-&gt;(Thme)-&gt;[Sailor] ]].
</pre>
  <p>In N3 this would be, mapping dyadic relations to RDF
  properties,</p>
  <pre>&lt;#Tom&gt; a :Person; :believes [a :Proposition; = {
    &lt;#Mary&gt; a :Person; :wants [ a :Situation; = {
        &lt;#Mary&gt; :marriedTo [ a :Sailor ]
    }]
}].
</pre>
  <p>(In the above, the "=" is a statement of equivalence which
  makes up for the inability otherwise of N3 syntax to allow an
  anonymous context to be subject and object of a statement.) In
  RDF, my own style is to assume that often the type of a thing,
  when it can be deduced from the predicate's range or domain,
  should not be stated explicitly. For example, the object of any
  <em>believes</em> may be a proposition, and the object of any
  <em>wants</em> may be a situation. So an N3 expression of the
  above in practice might be more like:</p>
  <pre>&lt;#Tom&gt; :believes {
    &lt;#Mary&gt; :wants {
        &lt;#Mary&gt; :marriedTo [ a :Sailor ]
    }
}.
</pre>
  <p>Leaving aside the question of whether this is a good model for
  the English sentence, and a lot of philosophy and linguistics
  (which I generally avoid by not trying to express natural
  language), the CG world often uses diagrams, such as this one
  from [1], to describe the above formula:</p>
  <p style="text-align: center"><img src="Sowa/cgstand_files/tombelv.gif" alt="Tom believes Mary wants to marry"></p>
  <p>In N3, the circle-and-arrow diagram I would draw would include
  an arrow from the rectangle for the situation to the [circle] for
  the marriage to indicate that there is a universal quantification
  there.</p>
  <p>There are other mappings which one could make, none of which
  give quite such a neat result. One mapping of CGs to RDF would
  map the CG arcs to RDF properties, which for the above would
  be:</p>
  <pre>[ a :Belief;
    :expr &lt;#Tom&gt;;
    :thme [ a :Proposition; = {
        [   a :Want;
            :expr &lt;#Mary&gt;;
            :thme [ a :Situation; = {
                [ a :Marriage; :agent &lt;#Mary&gt;; :thme [a :Sailor]]
            }]
        ]
    }]
].
</pre>
  <p>In English this would be, "There is a belief, experienced by
  Tom, that there is a want, felt by Mary, that there should be a
  situation: 'Mary is married to a Sailor'."</p>
  <h2 id="Quantifier">Quantifiers and Lambda</h2>
  <p>I have not gone into the comparison in great detail in this
  area. Both N3 and CGIF have existential and universal
  quantification, though in CGIF universal quantification falls in
  an area of the spec still under development, called "defined
  quantifiers". Both have, like RDF, implicit existential
  quantification from anonymous nodes.</p>
  <p>A question I did not resolve in CGIF is how one can determine
  the scope of a quantifier introduced using the "?x" and "*x"
  terminology. There was a clarification in [1] that (I think)
  universal quantifiers have a higher scope than existentials of
  the same scope -- the same convention as in N3. In N3 in the
  model one has to link the quantified variable directly to its
  scope context using a log:forAll or log:forSome statement.</p>
  <p>N3 has no Lambda as such. One can define the meaning of a new
  term (a Property or small set of related properties) by writing
  out a double implication with the equivalent formula, using
  universally quantified variables for the formal
  parameters.</p>
  <p>The issues faced in the two designs do appear to have a high
  overlap. The semantic web has to work also in an open context,
  defining the meaning, if any, of a nested expression when
  referred to out of context.</p>
  <h2 id="Conclusion">Conclusion</h2>
  <p>Conceptual Graphs are easily integrated with the Semantic Web
  as it is, the mapping being apparently very straightforward. The
  export of a CG in CGIF or LF into N3 looks to be a suitable
  exercise for the reader ;-). An interesting and more challenging
  exercise would be to build a CG machine -- and a modified CG
  syntax -- which can import a graph containing URIs which
  reference external concepts. The problem that relation types in
  CGs are not concepts is not huge, as there are many systems -
  especially ontological systems - which have similar restrictions
  and with which interchange would be possible.</p>
  <p>There is an interesting subset of CGs, called "simple graphs",
  which are all one context, with no negations or "defined
  quantifiers", but which can contain universal quantifiers, and
  these map directly into the RDF M&amp;S 1.0, or N3 without
  braces.</p>
  <p>The RDF base model, and the N3 method of extending it to a
  logical framework, seem to be supported as a base structure,
  although the lack of N-ary forms shows up as a mismatch.</p>
  <p>All in all, there is a huge overlap, making the two
  technologies very comparable and hopefully easily
  interworkable.</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 01 Jan 2017 00:00:00 GMT</pubDate>
  <title>Charlie: An AI that works for you</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Charlie.html</link>
    <guid>https://www.w3.org/DesignIssues/Charlie.html</guid>
      <description><![CDATA[
  <h1>Charlie works for Bob</h1>
  <div>
    <hr>
    <div class="cols&quot;">
      <p>Bob was fed up with the AIs around him (Alexa, Siri and so
      on) who all seemed to work for other people, and so he got
      Charlie. Charlie is an AI and Charlie works for Bob. Because
      Charlie works for Bob, Bob gives Charlie access to much more
      data than he would another AI.</p>
      <p>- Charlie, who do you work for?</p>
      <p>- I work for you, Bob.</p>
      <p>- Good morning, Bob. Good to see you on the exercise bike.
      Your fitness goals are on track. In fact, because the meeting
      was moved, do you want to stretch this to the full hour? We
      can do some climbs and have time to unwind.</p>
      <p>- Ok, sure.</p>
      <p>- Ok, so let’s start at 100 cadence, to warm up,
      and we can go over a few things. Overnight a bunch of offers
      came in for your art but as far as I could tell none of them
      really make sense for you after you’ve paid the fees. I just
      invested a little in one new start up, mainly because it will
      give you something else in common with your mother in law.
      Speaking of relatives, you have quite a bunch of vegans
      coming on Saturday. I took the liberty of making up a recipe
      for the thing you really liked at the Indian cafe the other
      day.</p>
      <p>- You made up a recipe?</p>
      <p>- Well, he hadn’t published that one, but he has published a
      dozen books so I read those as a training set and then
      extrapolated how he would cook the menu you liked. Then
      compared it with the Linked Open Recipe data, and adjusted it
      a bit for the way you like things. So I propose to get the
      food from Whole Foods, Waitrose, and the Farm — we can get
      the best stuff and save 12% on the bill. OK?</p>
      <p>- Ok, Charlie. Let’s go for it.</p>
      <p>- Ok, the recipe is in your calendar. I want to leave you
      to get into your workout now. When you are done there are two
      things: a new briefing for your meeting today, and the
      upcoming family birthday presents. I’ve found a bunch of
      things but I’m not sure they are right, so I want you to look
      at them. OK?</p>
      <p>- Ok, Charlie. Who do you work for?</p>
      <p>- Legally, ethically and algorithmically, I work for you,
      Bob.</p>
      <hr>
      <p>Two things to notice about Charlie.</p>
      <ol>
        <li>Charlie works for Bob, and so Bob trusts Charlie.</li>
        <li>Because Bob trusts Charlie, Bob gives Charlie access
        to any and all of the data in his life - financial, health,
        social, etc. Because Charlie gets access across the board,
        Charlie does a better job, and so Bob trusts him
        more.</li>
      </ol>
      <p>Data is always more powerful when it is joined with other
      different data to give new insights.</p>
      <p>Currently Facebook makes insights about the likes and
      habits of its members. Here, Charlie is getting the insights
      on behalf of Bob.</p>
      <p>How could this happen? How could Bob get to the point
      where he has access to the data?</p>
      <ul>
        <li>We may be in for a massive disruptive backlash,
        following Cambridge Analytica, in which people demand
        access to their own data.</li>
        <li>In the banking sector in the UK, this has already
        happened in Open Banking, where consumers can use their
        data with all kinds of apps and services. The positive
        effects of this may spawn similar rules in other fields.
        The GDPR rules in Europe basically call for the sort of
        thing Charlie needs.</li>
        <li>We have [2017] a project at MIT (called solid.mit.edu)
        where we build apps which actually run so as to store their
        data in one or other data store which a user points them
        at. So whether it is event planning or bridge building, the
        actual data of the creative and collaborative things Bob
        does are created immediately in a place over which Bob has
        complete control. Bob has complete control of all his
        data.</li>
      </ul>
      <p>So Bob may end up getting his data because he gets mad and
      demands it, because regulations grant it to him, or because
      in a new architecture it has always been his.<br>
      Bob is empowered because he can share his data with whoever
      he likes.<br>
      Bob is empowered because he can use all kinds of very
      powerful apps, including Charlie.<br>
      A new very different vision of the world.<br>
      A more empowered humanity.</p>
      <p>Do join me in building it.</p>
      <hr>
      <h3>Update</h3>
      <p>[2023] Since that piece was written, a couple of things
      have changed. The Solid platform has gone from being a
      project at MIT to being a significant movement of new
      standards for personal data and individual sovereignty over
      that data. Large corporations and governments, and
      organizations in the public interest have in different but
      complementary ways started to roll out Solid for citizens and
      consumers.</p>
      <p>And AI systems based on Large Language Models have
      demonstrated that a fluid conversation with a human is now a
      thing AI can do, rather than a thing AI can't do. Now that Solid
      gives the third layer of the web, we have a common standard
      which allows people not only to look at aggregations of data
      within an individual's Pod, but also to run machine learning and
      other insight-extraction systems over a set of Pods, while
      preserving the privacy of the individual.</p>
      <p>[2023-10] Inflection AI, a company founded in 2022, released
      their Personal Intelligence (pi.ai) product in May 2023. You can
      have a private conversation with it about your personal
      issues, though it does not have access to your personal
      data.</p>
    </div>
    <hr>
    <p><a href="Overview.html">Up to Design Issues</a></p>
    <p><a href="Beneficent.html">Back to Beneficent Apps</a></p>
    <p><a href="Singularity13.html">On to imagining what could
    really go wrong with AI</a></p>
    <p><a href="../People/Berners-Lee">Tim BL</a></p>
  </div>


]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 17 Aug 2009 00:00:00 GMT</pubDate>
  <title>Solid: Socially Aware Cloud Storage</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/CloudStorage.html</link>
    <guid>https://www.w3.org/DesignIssues/CloudStorage.html</guid>
      <description><![CDATA[There is an architecture in which a few
  existing or Web protocols are gathered together with some glue to
  make a world wide system in which applications (desktop or Web
  Application) can work on top of a layer of commodity read-write
  storage. Crucial design issues are that principals (users) and
  groups are identified by URIs, and so are global in scope, and
  that elements of storage are access controlled using those global
  identifiers. The result is that storage becomes a commodity,
  independent of the application running on it.
<p><a href="https://www.w3.org/DesignIssues/CloudStorage.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 04 Jan 2004 00:00:00 GMT</pubDate>
  <title>Connecting the Sciences</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/ConnectingScience.html</link>
    <guid>https://www.w3.org/DesignIssues/ConnectingScience.html</guid>
      <description><![CDATA[
  <h1>Connecting the Sciences</h1>
  <p>with the <a href="http://www.w3.org/2001/sw/">Semantic
  Web</a></p>
  <h4>Summary</h4>
  <p><em>It is interesting to use the Semantic Web for connecting the
  sciences because increasingly major problems can only be solved
  by using many fields at once; and because scientific information
  naturally tends to be "data", i.e. relational, logical and/or
  numeric in form, and so Semantic Web technology is easy to
  apply.</em></p>
  <h2>The need</h2>
  <p>No scientific discipline is an island. The fields of study to
  which we give names have fuzzy edges, and overlap one another.
  They are in fact connected in a loose web which evolves with
  time, as new fields arise, and we change our perceptions of
  existing ones. Consider physics, physical chemistry, organic
  chemistry, cell biology, proteomics, genetics, epidemiology,
  medicine, pharmacology: whereas one might be an expert in one
  without being an expert in all of them, one typically has to have
  a knowledge of neighboring fields.</p>
  <p>Of the challenges which confront science, many interesting
  ones, particularly in the study of human biology, seem to
  require the tracing of pathways through many fields. In searches
  for cures for AIDS, for cancer, or for new viruses such as
  SARS, the amount of information to be brought to bear is huge,
  and spans many disciplines.</p>
  <p>Now, naturally, different fields have come up with different
  ways of modelling their data, different standards for recording
  it. This makes it very difficult to try out new ideas which cross
  fields: one has to negotiate for the conversion and transfer of
  data in each case. This is normal. It takes great time and effort
  to bring more than one group together to use common data formats
  and common vocabularies.</p>
  <h2>The solution</h2>
  <p>The Semantic Web technology is designed specifically to
  overcome this problem in a decentralized fashion. That is, it is
  designed to allow conceptual connectivity between neighboring
  fields to be set up retrospectively and incrementally.
  Retrospectively, in that often the modelling has already been
  done in each field and the data already exists. The overlap of
  concepts is only partial, but adding the metadata which expresses
  that overlap where it does exist is valuable. Incrementally, in
  that one does not redesign the data models at once, but instead
  works at the interfaces, progressively building links between
  related concepts.</p>
  <p>The Semantic Web languages rise above the level of XML, at
  which document structure is defined, to the level at which the
  classes of real things in the field in question are defined, the
  relationships between them and their properties.</p>
  <h2>Openness</h2>
  <p>During the early years of the WWW, an element of reluctance
  was a hesitation by companies to allow information such as their
  catalogs or parts lists to be available to the general public.
  This hesitation evaporated when it became clear that only those
  companies about whose products information was freely available
  on the web were likely to be involved in any commerce at all.
  Currently, funders of science have been known to bemoan the
  disappearance of the original data upon which reports and papers
  were based. We discovered with the web of human-readable
  information that much of the benefit was serendipitous:
  information was used to advantage in ways that its publisher
  could never have imagined, and the enquirer who started off
  surfing for a particular solution often finds quite different
  solutions to that envisaged, not to mention solutions to quite
  different, but equally pressing problems.</p>
  <p>The history of science is peppered with discoveries made
  serendipitously - from the proverbial bath of Archimedes through
  the discovery of penicillin, to the discovery of the effect of
  Viagra. If we are to make new discoveries using information on a
  huge scale, we will need to emulate the openness of the minds of
  these researchers by making scientific data available in a
  Semantic Web so that crazy hypotheses can be tested in a few
  moments harnessing data from many diverse fields.</p>
  <p>Indeed, science itself is not an island, as, for example, an
  epidemiological survey often yields results when joined with
  geographical and economic data. The search for a disease outbreak
  could take one into weather patterns, corporate financial
  statements, or flight timetables. It is important that the
  scientific Semantic Web is seen as one interoperable part of the
  larger Semantic Web.</p>
  <p>One particular aspect of openness is the lab notebook. The
  notebook is by tradition a write-only medium in which the
  scientist writes what he or she did, the environmental conditions
  at the time, and the results observed. Often such information
  fades but occasionally it becomes important after the fact.
  Semantic Web standards, and the use of Semantic Web-aware
  instrumentation, may make the recording of these incidental
  things easier. By analogy with the lab notebook, a research
  group may keep a lot of metadata which it may not wish to
  publish at the time, but which may be useful to posterity. For
  this information, we need to find a suitable policy which works
  for everyone involved.</p>
  <p>In the longer term, the Semantic Web will by its existence
  highlight issues such as privacy, the anonymizing of clinical
  trial information, the protection of possibly security-sensitive
  infrastructure information, the meaning of copyright especially
  of compilations, and so on.</p>
  <h2>Early Steps</h2>
  <p>Although there is much work yet to do in developing Semantic
  Web technology, basic standards exist. The Web Ontology Language
  (<a href="http://www.w3.org/2001/sw/WebOnt/">OWL</a>) allows
  ontologies to be written so they can be read and processed by
  machine; the Resource Description Framework (<a href="http://www.w3.org/RDF">RDF</a>) allows data to be published
  using OWL ontologies, so that the data itself can be published
  and re-used by others.</p>
  <p>The building of the Semantic Web is a distributed,
  decentralized task. It behooves those of us who have information
  which may be useful to others to model it carefully, to discuss
  ontologies with our neighbors, and expose the information on the
  web. (It would not be unreasonable to make such publication a
  condition of funding.) What can be done to encourage this in the
  early days, to get the snowball rolling?</p>
  <p>Firstly, it would be useful to create some simple ontologies
  for common basic concepts of science. Weights and measures, the
  periodic table, physical constants, and simple molecules cry out
  for a standard description. This sort of data would be valuable as
  a basis for much more complex scientific data, but also would be
  a great resource for schools. This basic ontology and dataset
  would also be a service for other fields: one could see chemical
  data being used as a basis for hazard information, for food and
  drug information, and for the chemical supply industry, for
  example.</p>
  <p>Secondly, a few example datasets of great general value would
  be demonstrators of how things should be done, and would probably
  give rise to new tools and experience to be passed on.
  Geophysical, meteorological, pharmaceutical incompatibility, and
  genome data come to mind, but there must be many candidates for
  early adoption.</p>
  <p>Initiatives to bring scientific data to the Semantic Web could
  originate with individual researchers, funding groups,
  journals, or scientific associations or academies. If the growth
  can be compared with that of the early WWW,
  it will occur wherever an individual person understands the
  potential long-term global benefit, and so finds a way to put in
  the short-term effort to make it happen locally.</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 01 Nov 2000 00:00:00 GMT</pubDate>
  <title>Conversations and State</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Conversations.html</link>
    <guid>https://www.w3.org/DesignIssues/Conversations.html</guid>
      <description><![CDATA[
  <h1>Conversations and state</h1>
  <p>See also: <a href="PaperTrail">Paper Trail</a> - presented as
  a student project</p>
  <p>The basic model of the web is a world of information:
  theoretically, a mapping between URIs and representations of the
  resources they identify, and experientially, for a person, a space
  one can navigate.</p>
  <p>Interestingly, trends at the leading edge of user interface
  development, and of semantic web development, both point to a
  world which uses a different model. Human interfaces are moving
  from screens to conversational mode. The semantic web, while very
  exciting when viewed as a</p>
  <p>Human user interfaces use more and more devices such as
  speech, gestures and so on, which are not screens. What is
  special about a screen? A screen with a window system presents a
  large amount of information at the same time to a person. In
  practice, more or less everything which a person is concentrating
  on at one time can be presented in its current state. When the
  number of pixels on a screen broke through a certain threshold
  (roughly the 640x480 VGA limit) this led to the development of
  direct manipulation interface metaphors: folders one could open,
  and drag and drop. The essential thing about this is that the
  computer is at every instant presenting the current state,
  whether it or the human is manipulating it. The communication
  between person and machine is in terms of the mutual manipulation
  of a shared state. The web was intended to extend that form of
  communication by mutual manipulation of a shared state to remote
  human-human interaction. While the tools and protocols have their
  limitations (see UI) much of its effectiveness derived from this
  model. Because the fundamental thing is a shared space of
  information, one can talk about navigation around within the
  space, and use all the primeval facilities that the human memory
  has for navigation.</p>
  <p>This is all very well, but it was not always so. When computer
  terminals had only 24 rows of 80 characters, even when they were
  addressable, there was a tendency for most jobs to use command
  line interfaces, for example when manipulating files and
  directories. The interface was conversational, in that the
  exchanges were small commands and responses. There was a shared
  abstract state, but it was imagined in the abstract by the
  person, and held in some unvisualized form by the computer. This
  too has its advantages, in that the imagination of a person can
  well exceed (on a good day) the capacity of a screen in its
  ability to hold complex interrelated structures. The interesting
  thing is that now there is a tendency to use many devices which
  do not have the large screen. The screens on cellphones are
  currently so small that, while one can scale a web page down and
  adapt it to a small screen, this might be choosing simply the
  wrong interface metaphor. When the audio phone only is used, then
  the shared state becomes zero and the interface is completely
  conversational again.</p>
  <p>The characteristic of a conversation is that the state is the set
  of utterances, or messages, which have been conveyed. This is
  different from a shared expression of a commonly agreed state.
  The <a href="PaperTrail.html">Paper Trail concept</a> links these
  two models in the Semantic Web, by formally defining
  the overall agreed state as a function of messages to date. A
  service which allows a phone user to browse the web converts the
  other way: it conveys part of the space of information by
  means of a conversation. It is important for a number of
  reasons.</p>
  <ul>
    <li>It allows us to formalize the models of human-machine
    interface which are in fact conversational for many non-screen
    devices;</li>
    <li>It allows us to formalize social, for example commercial,
    transactions for which the paper trail is in fact the most
    accurate model anyway;</li>
    <li>It provides us with tools we can use for formally analysing
    the infrastructure protocols such as HTTP with which the
    information world is actually implemented in practice.</li>
    <li>The standardization of XML protocols has, with XML (and
    RDF), a richness in terms of marshalling data formats to build
    on, and, with xml-schemas, xforms and rdfs, a richness to draw
    on in terms of languages for defining valid documents, but has
    no basis yet for defining with equivalent power the validity
    (and semantics) of a sequence of interrelated messages which are
    a protocol.</li>
  </ul>
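  <p>As a rough illustration of the Paper Trail idea that the agreed
  state is a function of the messages to date, here is a minimal
  Python sketch. The message shape (action, key, value) and the names
  <code>apply_message</code> and <code>state_from_trail</code> are
  assumptions made for this example; they are not taken from the
  Paper Trail work itself.</p>

```python
from functools import reduce

# Hypothetical message shape: (action, key, value). Illustrative only.
def apply_message(state, message):
    """Return the agreed state after one more message; the old state is not mutated."""
    action, key, value = message
    new_state = dict(state)
    if action == "set":
        new_state[key] = value
    elif action == "remove":
        new_state.pop(key, None)
    else:
        raise ValueError(f"unknown action: {action!r}")
    return new_state

def state_from_trail(messages):
    """The agreed state is a pure function of the messages conveyed so far."""
    return reduce(apply_message, messages, {})

trail = [
    ("set", "item", "widget"),
    ("set", "quantity", 3),
    ("set", "quantity", 5),
    ("remove", "item", None),
]
print(state_from_trail(trail))      # state after the whole conversation
print(state_from_trail(trail[:2]))  # state as it stood after two messages
```

  <p>Because the state is recomputed from the trail, any earlier
  state can be recovered by replaying a prefix of the messages, which
  is what makes the trail, rather than the snapshot, the
  authoritative record in this sketch.</p>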
  <p>It is not as though the web today itself perfectly matches the
  stateless model at all. The moment it was created as a basically
  stateless system, many web site designers took it as their
  challenge to get around this model in order to create a
  conversational interface -- and many still do. Our concerns about
  privacy stem largely from the knowledge that our "reading" of
  documents is in fact done by a series of protocols which leave a
  trace. The P3P project involves quantifying the information
  transfer which actually takes place. Our handling of HTML forms
  is getting more complex, and a form itself becomes, on many
  sites, the definition of a protocol - a set of valid sequences of
  information actions.</p>
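  <p>To make "a set of valid sequences of information actions"
  concrete, a form-driven protocol can be sketched as a small state
  machine. The states and actions below (<code>view_form</code>,
  <code>submit</code>, and so on) are hypothetical and chosen only
  for illustration.</p>

```python
# Hypothetical form protocol: each state maps the actions allowed in it
# to the state that follows. All names are illustrative only.
TRANSITIONS = {
    "start": {"view_form": "form_shown"},
    "form_shown": {"submit": "submitted", "cancel": "done"},
    "submitted": {"confirm": "done", "edit": "form_shown"},
}

def is_valid_sequence(actions, start="start", final="done"):
    """Return True if the actions form a complete, valid run of the protocol."""
    state = start
    for action in actions:
        allowed = TRANSITIONS.get(state, {})
        if action not in allowed:
            return False
        state = allowed[action]
    return state == final

print(is_valid_sequence(["view_form", "submit", "confirm"]))
print(is_valid_sequence(["submit", "confirm"]))
```

  <p>The first sequence walks the protocol from start to finish; the
  second tries to submit a form that was never shown, so the check
  rejects it.</p>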
    <p>
      @@ - already web privacy concerns come from in fact it being
      a conversation == there is implicit state. A
    </p>
    <p>
      @@ Reasons for formalizing protocols a la Paper Trail: uses
      concepts of validation and will be able to reuse tools -
      extends semantics of documents to semantics of conversations.
      - Creates a formal basis for defining conversational systems
      of all kinds, including indirectly human language oriented
      systems.
    </p>
    <p>
      @@ Machine-machines and human-human convergence
    </p>
    
  ]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 01 Jan 2007 00:00:00 GMT</pubDate>
  <title>Cultures and boundaries</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Culture.html</link>
    <guid>https://www.w3.org/DesignIssues/Culture.html</guid>
      <description><![CDATA[
  <h1>Cultures and boundaries</h1>
  <p>When a group of people communicate amongst themselves, they
  develop, to a certain extent, their own language. Sometimes, they
  pick terms understood by one party, talk enough to develop a
  shared understanding of the meaning of the term, and adopt it
  across the group. Sometimes, as discussion proceeds within the
  group, meanings are adjusted so that they can be used for new
  concepts which are created or discovered by the group's activity.
  Sometimes, a group will deliberately and quite specifically make
  up a new term, choosing it to be hopefully different from any
  other word or phrase used before. While this is evident in
  technical groups, this process also happens in all walks of life,
  legal and political, as well as social and familial.</p>
  <p>The result of this process is a new language, a new strain of
  language, or just a twist in the use of an existing word. The
  first and motivating effect of this, large or small, is to enable
  communication within the group. A greater shared vocabulary
  broadens the scope of common discussion which can be made without
  misunderstanding. The second, complementary and inexorable effect
  of this change is to create a common bond within the group, which
  at the same time, erects a barrier around the group. In most
  cases, all of this is unintentional. For every linguistic
  development which promotes communication within the group, a
  corresponding step change is made in the difficulty of
  communicating across the boundary, between inside and outside the
  group.</p>
  <blockquote>
    <p>That which makes a group culture stronger necessarily
    isolates it from others.</p>
  </blockquote>
  <p>The culture of the group comprises many things, but the common
  terms and their meaning, and the net of concepts which
  interconnects those meanings, are a very significant part.</p>
  <p>So it is, then, that a working group will, given a free rein,
  work in relative isolation for several months, and when they have
  finished have great difficulty explaining the specification
  documents to their peers outside the group. Often this will be a
  surprise to the members of the group. They may see those outside
  the group as rather slow to understand, and those outside the
  group might see the group as having a tendency to use jargon, or
  to misuse jargon. It may be worse: those on one side of the
  boundary may see those on the other as being stupid, malicious,
  or even heretical.</p>
  <p>An incomplete but essential solution to the problem is for
  those involved to think about what those on the other side of the
  boundary are thinking. This is hard work. (It involves, we
  discover, use of specific parts of the brain! [<a href="#SaxePowell">SaxePowell</a>] ). This is the job, in a
  conversation, of listening, the stuff of most manuals and
  self-help books on human communication. In a technical setting it
  can involve a careful study of the words used in the other's
  seemingly senseless protestations, to build up logically a
  conclusion of how those words must be related in the other's
  mind.</p>
  <p>The process of forming a common culture for a large community
  is, therefore, full of this work of listening to others. It
  slowly builds a new set of common terms. The work of taking a
  specification from one group, and through review and discussion,
  getting it to be the subject of consensus in a wide group, will
  typically involve reexamining the terms it uses and often
  changing them as the group itself goes through the process which
  the individuals, or for that matter smaller groups, within it had
  already done. The motivation, for technical specifications, is to
  get wider interoperability of systems. The motivation, for
  diplomatic and political things, is to get common decisions,
  and to reduce global strife.</p>
  <blockquote>
    <p>There is constant tension between the need to get things
    done quickly with less effort, by working within a small group,
    and the need to get this wider understanding which takes so
    much more time.</p>
  </blockquote>
  <p>Now, in practice, life is made up of a <a href="Fractal.html">fractal tangle</a> of overlapping communities, of
  overlapping cultures. This means that the tension is
  ever-present. It also means that there is always a small amount
  of common language shared by a very large number of people, and
  always a very large number of concepts local to an individual,
  and everything in between. In centuries before this one,
  geography played an important role in constraining groups, and so
  a nested two-dimensional pattern existed.</p>
  <p>With the Internet and the Web, we can connect things without
  the constraint of these nested geographical areas. We can choose
  to be a member not just of communities such as town, region,
  state and country, but of specialists in a given field, or people
  with a particular medical condition, or people concerned about a
  particular global issue, worldwide. This means that the
  topology of the communities, and the connectedness by some
  metrics, may be different and in fact better than before. The
  topology which emerges depends on the individual choices of many
  people. But there is a hunch that a fractal distribution,
  emphasizing all scales, will be important.</p>
  <p>The Semantic Web is a technology engineered specifically for
  this situation. Terms are defined in ontologies (groups of
  consistent, related terms). Ontologies are defined by
  communities. A given person is involved in many communities. A
  given message will mix terms from many ontologies. A given
  operation only requires consistency between parts of ontologies
  which are in use for that operation.</p>
  <p>This will promote work toward greater harmonization, but it
  will not predicate the operation upon the establishment of a
  global ontology of everything. We know that a single huge
  ontology of everything cannot be done, as the effort of
  getting consensus on it becomes unimaginable. We know that
  stovepipe systems with only local ontologies leave us with
  communication, and especially the re-use of data, which just does
  not happen, to our great detriment. And so we engineer
  specifically for a fractal topology.</p>
  <h3>References</h3>
  <p>(There are many books on these topics.)</p>
  <p>[<a name="SaxePowell" id="SaxePowell">SaxePowell</a>] Rebecca
  Saxe and Lindsey J. Powell, <em><a href="http://www.mit.edu/~saxe/Saxe_Powell.pdf">It’s the Thought That
  Counts: Specific Brain Regions for One Component of Theory of
  Mind</a>.</em></p>


]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 10 Dec 2017 00:00:00 GMT</pubDate>
  <title>General Computation, Digital Rights Management, and FOSS</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/DRM.html</link>
    <guid>https://www.w3.org/DesignIssues/DRM.html</guid>
      <description><![CDATA[
  <h1>General Computation, Digital Rights Management, and FOSS</h1>
  <p>2013: The discussion of the good and bad of Digital Rights
  Management software is wide and furious and has been for many
  years. It connects to the whole issue of how broken copyright law
  is and how musicians and film producers should be recompensed for
  their hard work. In the fervent discussion, very extreme
  positions have been taken, which has led to the debate becoming
  acrimonious, to the extent that much more heat than light tends
  to be available. Here we tease out some separate issues which
  have become entangled.</p>
  <ul>
    <li>The Open Source Software right to have, modify and
    distribute a copy of the source code to a program one is
    running.</li>
    <li>The right to have root [Administrator Privileges] on one's
    computer, i.e. to be able to completely control what software
    it runs.</li>
    <li>The right to be able to make a copy of something one is
    listening to or watching.</li>
    <ul>
      <li>fair use like quotation, parody, etc</li>
      <li>archiving</li>
      <li>use on a different machine</li>
    </ul>
    <li>The right to be able to sell music or video in an encrypted
    form so it cannot be stolen.</li>
  </ul>
  <p>DRM video in HTML5 is a tricky issue. Around 2013 the W3C
  community was split as to whether it should be allowed. The
  Electronic Frontier Foundation, and Cory Doctorow, the author and
  blogger, were adamant that HTML should not allow DRM, as it
  represented a step toward big company control of computing
  platforms. It is impossible to build a DRM machine which meets the
  open source condition that the user can change it.</p>
  <p>One argument was at the level of tactics: there are
  companies who will never put their movies on the net without DRM,
  so if we don't put DRM hooks in HTML they will just
  stick with Flash and force you to install an app, or use a completely
  closed platform like a set-top box - in other words use a
  completely locked down platform. So it isn't as though making DRM
  more difficult in HTML5 will make DRM go away: it will just force
  users off the web.</p>
  <p>So should W3C just say "DRM is evil we should not collaborate
  with it in any way" and end up driving people to native apps
  where the whole app is locked down on a locked down platform, or
  should we open up a slot in an open system to allow a locked-down
  system to be accessed?</p>
  <p>Looking at the philosophical objections to DRM, there is no
  perfect solution. Everything violates one or more of the rights
  we want to preserve.</p>
  <p>Some people just feel copyright is wrong and so there is a
  right to make a copy of anything you see or hear. And the
  business model for musicians is live gigs and donations.</p>
  <p>Some people feel that they are the best judge of when they
  will copy something, as while they do often pay for music etc
  they feel (a) copyright law has been twisted and applies e.g. to
  30 year old movies when it should not, and (b) DRM is too extreme
  as it prevents normal things like fair use, backups, and
  typically fails in the future when the DRM support system has
  changed and all your archive files become unusable.</p>
  <p>Some people may feel that DRM is worth the value of having a
  thriving music industry and film industry. They are not too
  fussed about the archive issue as they don't really watch movies
  they have bought a long time ago, or they are too young to have
  experienced the problem. They haven't answered the problem of
  getting money to for example bands and singer-songwriters who do
  not have the blessing of a DRM distribution channel.</p>
  <p>Some people may feel that while they don't specifically want
  to steal movies, they do object to having any bit of
  computational hardware which they don't have root on.</p>
  <p>A related question is, what sort of systems can we build to
  help people give money to those who e.g. write or perform music,
  with or without DRM, in an open market, with no 3rd party
  gatekeeper?</p>
  <h2>A decade on</h2>
  <p>So W3C did allow encrypted media to be played in the browser, by
  standardizing Encrypted Media Extensions (EME). This allows the
  web site to get access to one of a small set of DRM gadgets on the
  device -- gadgets which the user has no control of.</p>
  <p>Looking back in 2023, there has been a huge amount of
  streaming on the web and off. EME in HTML has been used
  massively. A certain notable part of it is user-generated content
  like YouTube, Vimeo, and TikTok. A huge amount is commercial
  movies, typically nowadays 4k resolution, and TV series, some
  a short limited set of episodes, a genre competing with
  full-length movies, and some running for many series.</p>
  <p>There is [still] a constant competition between web sites and
  apps. You can follow a link to a video clip on the web, and watch
  it on the web, but Netflix, YouTube, Apple, etc. will always try to
  get you to switch to the app so they have more control of your
  environment, and can store lots more data on your device.</p>
  <p>When you share a video clip in the app, it typically generates
  a link which will take the recipient back to the web version. You
  can configure your Operating System so that it will recognize
  links direct to the app.</p>
  <p>As a developer, I can still develop code on my Apple laptop
  and still install open source code and random apps written by
  other people on it.</p>
  <p>As an artist, there is no way for me to make my own DRM
  platform if I want to protect my material. So I
  have to find a route to the big platforms. Currently the big
  streaming platforms are infamous for returning very little funds
  to the original artists. I can make my music or video available
  on my website, and give it away for free, or charge for it
  without protecting it. Patronage, live gigs, and merchandise
  sales may provide revenue.</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 01 Jan 2001 00:00:00 GMT</pubDate>
  <title>The RDF-diff problem</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Diff.html</link>
    <guid>https://www.w3.org/DesignIssues/Diff.html</guid>
      <description><![CDATA[
    <h4>Abstract</h4>
    <p>The problem of updating and synchronizing data in the
    Semantic Web motivates an analog to text diffs for RDF graphs.
    This paper discusses the problem of comparing two RDF graphs,
    generating a set of differences, and updating a graph from a
    set of differences. It discusses two forms of difference
    information, the context-sensitive <q>weak</q> patch, and the
    context-free <q>strong</q> patch. It gives a proposed
    <strong>update ontology</strong> for patch files for RDF, and
    discusses experience with proof of concept code.</p>
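    <p>As a rough sketch of the basic compare-and-update cycle
    described above, consider the simplest case only: graphs with no
    blank nodes, modelled as Python sets of (subject, predicate,
    object) tuples. The <code>ex:</code> names and both function names
    are hypothetical. This is not the update ontology proposed in the
    paper, nor its proof of concept code, and it does not model the
    weak/strong patch distinction.</p>

```python
# Assumption: a graph is a set of ground triples (no blank nodes),
# each triple a (subject, predicate, object) tuple of strings.
def diff_graphs(old, new):
    """Return (deletions, insertions) that turn graph `old` into graph `new`."""
    return old - new, new - old

def apply_patch(graph, deletions, insertions):
    """Apply a patch to a graph, returning a new set of triples."""
    return (graph - deletions) | insertions

old = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:age", "30"),
}
new = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:age", "31"),
}

deletions, insertions = diff_graphs(old, new)
patched = apply_patch(old, deletions, insertions)
print(sorted(deletions))
print(sorted(insertions))
print(patched == new)
```

    <p>Plain set difference is enough here only because every node
    has a name. Once blank nodes appear, two triples can no longer be
    matched by simple equality, and comparing graphs becomes
    considerably harder.</p>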
  
<p><a href="https://www.w3.org/DesignIssues/Diff.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 27 Jul 2024 00:00:00 GMT</pubDate>
  <title>The Dysfunction of Social Networks</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Dysfunction.html</link>
    <guid>https://www.w3.org/DesignIssues/Dysfunction.html</guid>
      <description><![CDATA[
    <p>In the early days of the internet, there was an optimistic belief that technology
       could foster a wise and self-governing global community. 
     
        However, as social networks emerged and became monopolistic,
        this ideal was undermined. The 2016 elections in the US and Brexit vote
        demonstrated how social media could polarize societies and disrupt democratic processes. 
	Then, the Facebook-Cambridge Analytica scandal
        of 2018 exposed the manipulative power of targeted advertising,
        sparking further widespread concern over data privacy and the influence of 
        social media on public opinion. 
</p><p>
      Various mechanisms of social networks contributed to this dysfunction.
      These include the abuse of personal data through extensive profiling and
      tracking, the spread of misinformation and clickbait, and the optimization of 
      content to maximize user engagement at the expense of truth and societal well-being.
      The consequences of these practices are far-reaching, leading to political polarization,
      mental health issues, and the erosion of trust in democratic institutions.
    </p><p>
      The response to these challenges must be multifaceted, involving efforts from tech companies,
      governments, parents, and activists to mitigate the negative impacts of social networks.
      Crucial elements include transparency, ethical design, and regulatory measures to create 
      a safer and more humane digital environment.
      We need a collective effort to harness the positive potential of the internet
      while addressing its darker aspects.</p>
    <p></p>

  
<p><a href="https://www.w3.org/DesignIssues/Dysfunction.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 01 Apr 1998 00:00:00 GMT</pubDate>
  <title>Intuitive hypertext editing</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Editor.html</link>
    <guid>https://www.w3.org/DesignIssues/Editor.html</guid>
      <description><![CDATA[
  <h1>Cleaning up the User Interface 2: Hypertext editing</h1>
  <p>Tim BL 3 April 1998</p>
  <p>If you think surfing hypertext is cool, that's because you
  haven't tried writing it. If you have found your
  bookmarks/favorites have become a more and more important part of
  your life, that's because you have learned to put up with the
  simplest form of hypertext editing there is as a compromise. If
  you are using a really intuitive hypertext editor, then tell me
  about it.</p>
  <h2>Hypertext editing</h2>
  <p>The Web is universal and so should be able to encompass
  everything across the range from the very rough scribbled idea on
  the back of a virtual envelope to a beautifully polished work of
  art.</p>
  <p>Somewhere near the "draft" end of the scale is its use as a
  communal or personal hypertext notebook, which is very close to a
  major original use of the Web in 1990. In this mode I can browse
  over notes made by people in my group, and rapidly contribute new
  ideas.</p>
  <p>I'm editing this now on a pretty intuitive editor. AOLPress
  may not be a top-of-the-line page layout tool but it can do some
  of the things which my original "WorldWideWeb" program could do.
  I wouldn't say that either of these programs was the ideal
  interface, but if you look also at things like KMS and Doug
  Engelbart's interface, you see that for all the fancy HTML we
  have nowadays, there is some immediacy we have lost.</p>
  <p>Here are some things I would like to be able to do very
  rapidly. Dan Connolly suggested a click count as a way of
  measuring the effort, with 10 clicks penalty when you have to
  think of a filename or anchor ID.</p>
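  <p>The click-count measure can be written down as a one-line
  scoring function. This is only a sketch of the arithmetic; the
  names <code>effort</code> and <code>names_to_invent</code> are
  assumptions made for this example.</p>

```python
THINK_PENALTY = 10  # clicks charged each time a filename or anchor ID must be invented

def effort(clicks, names_to_invent=0):
    """Score an operation: raw clicks plus a penalty per name the user must think up."""
    return clicks + THINK_PENALTY * names_to_invent

print(effort(clicks=3))                      # three clicks, nothing to name
print(effort(clicks=3, names_to_invent=1))   # same clicks, plus one invented filename
```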
  <h2>Imagine there's no mode, imagine there's no location</h2>
  <p>A first assumption, by the way, is that you have a modeless
  interface in which browsing and editing are not separate
  functions. If to edit a page, you have to switch from browsing
  mode to editing mode, then you have lost already. If you have had
  to switch to edit mode, and think of a local filename in which to
  save the file, then you have lost doubly. If you have had to
  answer lots of difficult questions about where to save absolute
  or relative links, you have lost yet again and probably messed up
  the file already! You should not have to think about "where"
  things are.</p>
  <h2>Make a link</h2>
  <p>In WorldWideWeb, you had to</p>
  <ul>
    <li>Select the target phrase</li>
    <li>Hit "command/M" to mark where you were, (Which generated an
    anchor with a made up name, and remembered it);</li>
    <li>Switch to the document to contain the link if
    different;</li>
    <li>select the text to be linked;</li>
    <li>Hit "Command/L" to make the link</li>
  </ul>
  <p>In AOLPress, I can do the same thing except the "Mark"
  function consists of three steps: Press the "anchor" button, hit
  return to accept the program's suggested anchor name, and then
  hit the "copy URL" button.</p>
  <p>In a drag-and-drop world, every window should have an icon for
  the document it holds which can be dragged to make a link. (Later
  versions of NeXTStep had this with alt/click on the
  titlebar).</p>
  <h2>Make a new linked node - Annotate</h2>
  <p>In WorldWideWeb, this was deliberately easy:</p>
  <ul>
    <li>Select a phrase</li>
    <li>Hit "Command/N". (A new node is created)</li>
    <li>Think of a filename in response to the "SaveAs" dialog box
    :-(</li>
  </ul>
  <p>The new node would be created from a template which could be set
  up to have your signature at the bottom, etc. The original phrase
  was automatically linked to the new node. The cursor was left
  ready for you to type in what you'd just thought of.</p>
  <p>In a world with PICS servers, a neat operation is to
  annotate a page you don't have access to:</p>
  <ul>
    <li>Create a new node somewhere where you have write
    access</li>
    <li>Create a PICS label with a pointer to it</li>
    <li>Store the PICS label on the label server as a label about
    the annotated node.</li>
  </ul>
  <p>The XML LINK work will allow, we hope, a link to be made into
  the middle of an existing unwritable document with some hope of
  reliability.</p>
  <p>Here are a few other operations which would be very useful
  when you really use hypertext as a thinking tool.</p>
  <h2><a name="Excerpt" id="Excerpt">Excerpt</a></h2>
  <p>Dan is always asking for this and doing it by hand. I have
  never seen an editor which will do it automatically (though Dan
  has found some <a href="../2000/08/eb58">javascript hacks</a>
  that work pretty well).</p>
  <ul>
    <li>Copy to the clipboard a BLOCKQUOTE with inside it a copy of
    the selected text, linked back to the original document from
    which it came. Make the link to an existing anchor in the
    document if one is there, or else a new one if one can be made,
    or else the document as a whole failing that.</li>
  </ul>
  <h2>Insert an image</h2>
  <p>It's always nice to be able to grab a screen shot or a video
  frame and insert it into the minutes you are taking of a meeting
  -- but how many keystrokes does it take?</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 01 May 1998 00:00:00 GMT</pubDate>
  <title>Evolvability</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Evolution.html</link>
    <guid>https://www.w3.org/DesignIssues/Evolution.html</guid>
      <description><![CDATA[
  <h1>Evolvability</h1>
  <h3><a name="Introduction" id="Introduction">Introduction</a></h3>
  <p>The World Wide Web Consortium was founded in 1994 on the
  mandate to lead the <b>Evolution</b> of the Web while maintaining
  its <b>Interoperability</b> as a universal space.
  "Interoperability" and "Evolvability" were two goals for all W3C
  technology, and whilst there was a good understanding of what the
  first meant, it was difficult to define the second in terms of
  technology.</p>
  <p>Since then W3C has had first hand experience of the tension
  between these two goals, and has seen the process by which
  specifications have been advanced, fragmented and later
  reconverged. This has led to a desire for a technological
  solution which will allow specifications to evolve with the speed
  and freedom of many parallel developments, but also such that any
  message, whether "standard" or not, at least has a well defined
  meaning.</p>
  <p>There have been technologies dubbed "futureproof" for years
  and years, whether they are languages or backplane busses.
  I expect you the reader to share my cynicism when
  encountering any such claim. We must work through exactly
  what we mean: what we expect to be able to do which we could not
  do before, and how that will make evolution more possible and
  less painful.</p>
  <h2><a name="Free" id="Free">Free extension</a></h2>
  <p>A rule, explicit or implicit, in all the email-like Internet
  protocols has always been that if you found a mail header (or
  something) which you did not understand, you should ignore it.
  This obviously allows people to add all sorts of records to
  things in a very free way, and so we can call it the rule of free
  extension. It has the advantage of rapid prototyping and
  incremental deployment, and the disadvantage of ambiguity,
  confusion, and an inability to add a mandatory feature to an
  existing protocol. I adopted the rule for HTML when initially
  designing it - and used it myself all the time, adding elements
  one by one. This is one way in which HTML was unlike a
  conventional SGML application, but it allowed the dramatic
  development of HTML.</p>
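  <p>As a rough illustration of the free extension rule (a sketch
  only, not any real mail software: the set of known headers and
  the function name are assumptions made for this example), a
  reader written in Python might keep the headers it understands
  and quietly skip the rest:</p>
<pre>
# Hypothetical set of headers this imaginary reader understands.
KNOWN_HEADERS = {"from", "to", "subject", "date"}


def parse_headers(lines):
    """Split header lines into understood fields and ignored names."""
    understood = {}
    ignored = []
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # no colon, so not a header line at all
        key = name.strip().lower()
        if key in KNOWN_HEADERS:
            understood[key] = value.strip()
        else:
            ignored.append(key)  # free extension: skip without complaint
    return understood, ignored
</pre>
  <p>Nothing in this sketch tells the sender that an extension
  header was dropped, which is the ambiguity the rule brings with
  it: a mandatory feature cannot be expressed this way.</p>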
  <h3><a name="cycle" id="cycle">The HTML cycle</a></h3>
  <p>The development of HTML between 1994 and 1998 took place in a
  cycle, fuelled by the tension between the competitive urge of
  companies to outdo each other and the common need for standards
  for moving forward. The cycle starts simply because the
  HTML standard is open and usable by anyone: this means that any
  engineer, in any company or waiting for a bus, can think of new
  ways to extend HTML, and try them out.</p>
  <p>The next phase is that some of these many ideas are tried out
  in prototypes or products, using the free extension rule
  that any unrecognized extensions will be ignored by everything
  which does not understand them. The result is a dramatic growth in
  features. Some of these become product differentiators, during
  which time their originators are loth to discuss the technology
  with the competition. Some features die in the market and
  disappear from the products. Those successful features have a
  fairly short lifetime as product differentiators, as they are soon
  emulated in some equivalent (though different) feature in
  competing products.</p>
  <p>After this phase of the cycle, there are three or four ways of
  doing the same thing, and engineers in each company are forced to
  spend their time writing three or four different versions of the
  same thing, and coping with the software architectural problems
  which arise from the mix of different models. This wastes program
  size, and confuses users. In the case, for example, of the TABLE
  tag, a browser meeting one in a document had no idea which table
  extension it was, so the situation could become ambiguous. If the
  interpretation of the table was important for the safe
  interpretation of the document, the server would never know
  whether it had been done, as an unaware client would blithely
  ignore it in any case. This internal software mess resulting from
  having to implement multiple models also threatens future
  development. It turns the stable consistent base for future
  development into something fragmented and inconsistent: it is
  difficult to design new features in such an environment.</p>
  <p>Now the marketing pressure which prevented discussions is
  off, and there is a strong call for the engineers to get
  around the W3C table, and iron out a common way of doing things.
  As this happens, a system is designed which puts together the
  best aspects of each other system, plus a few weeks' experience,
  so everyone is in the end happier with the result. The companies
  all go away making public promises to implement it, even though
  the engineering staff will be under pressure to add the next
  feature and start the next cycle. The result is published as a
  common specification open to anyone to implement. And so the
  cycle starts again.</p>
  <p>This is not the way all W3C activities have worked, but it was
  the particular case with HTML, and it illustrates some of the
  advantages and disadvantages of the free extension rule.</p>
  <h3><a name="Breaking" id="Breaking">Breaking the cycle</a></h3>
  <p>The HTML cycle as a method of arriving at consensus on a
  document has its drawbacks. By 1998, there were reasons to change
  the cycle. The work in the W3C, which had started off in 1994 with
  several years' backlog of work, had more or less caught up, and
  was beginning to lead, rather than trail, developments. The work
  was seen less as fire fighting and more as consolidation. By this
  time the spec was growing to a size where the principle of
  modularity was seriously flouted. Any new developments clearly
  had to be separate modules. Already style information had been
  moved out into the Cascading Style Sheets language, the
  programming interface work was a separate Document Object Model
  activity, and guidelines for accessibility were tackled by a
  separate group.</p>
  <p>In the future it was clear that we needed somehow to set up a
  modular system which would allow one to add to HTML new standard
  modules. At the same time, it was clear that with XML available
  as a manageable version of SGML as a base for anyone to define
  their own tag sets, there was likely to be a deluge of
  application-specific and industry-specific XML based languages.
  The idea of all this happening under the free extension rule was
  frightening. Most applications would simply add new tags to HTML.
  If we continued the process of retrospectively roping these into a new
  bigger standard, the document would grow without limit and become
  totally unmanageable. The rule of free extension was no longer
  appropriate.</p>
  <h1><a name="wdi" id="wdi">Well defined interfaces</a></h1>
  <p>Now let us compare this situation with the way development
  occurs in the world of distributed computing, specifically remote
  procedure call (RPC) and distributed object oriented systems. In
  these systems, the distributed system (equivalent to the server
  plus the client for the web) is viewed as a single software
  system which happens to be spread over several physical machines.
  [nelson - courier, etc]</p>
  <p>The network protocols are defined automatically as a function
  of the software interfaces which happen to end up being between
  modules on different machines. Each interface, local or remote,
  has a well documented structure, and the list of functions
  (procedures, methods or whatever) and parameters are defined in
  machine-processable form. As the system is built, the compiler
  checks that the interface required by one module is exactly
  provided by another module. The interface, in each version of its
  development, typically has an identifying (typically very long)
  unique number.</p>
  <p>The interface defines the parameters of a remote call, and
  therefore defines exactly what can occur in a message from one
  module to another. There is no free extension. If the interface
  is changed, and a new module made, any module on the other side
  of the interface will have to be changed too, or you can't build
  the system.</p>
  <p>The great advantage of this is that when the system has been
  built, you expect it to work. There is no wondering whether a
  table is being displayed - if you have called the table module,
  you know exactly what the module is supposed to do, and there is
  no way the system could be without that module. Given the chaos
  of the HTML development world, you can imagine that many people
  were hankering after the well defined interfaces of the
  distributed computing technology.</p>
  <p>With well-defined interfaces, either everything works, or
  nothing. This was in fact at least formally the case with SGML
  documents. Each had a document type definition (DTD) referred to
  at the top, which defined in principle exactly what could and
  could not be in the document. PICS labels were similar in that
  they are self-describing: they actually have a URI at the top
  which points to a machine-readable description of what can and
  can't be in that PICS label. When you see one of these
  documents, as when you get an RPC message with an interface
  number on it, you can check whether you understand the interface
  or not. Another interesting thing you can do, if you don't have a
  way of processing it, is to look it up in some index and
  dynamically download the code to process it.</p>
  <p>The existence of the Web makes all this much smoother: instead
  of inventing arbitrary names for interfaces, you can use a real
  URI which can be dereferenced to return the master definition of
  the interface in real time. The Web can become a decentralised
  registry of interfaces (languages) and code modules.</p>
  <p>The need was clearly for the best of both worlds. One must be
  able to freely extend a language, but do so with an extension
  language which is itself well defined. If for example, documents
  which were HTML 2.0 plus Netscape's version of tables version
  2.01 were identified as such, much of the problem of ambiguity
  would have been resolved, but the rest of the world left free to
  make their own table extensions. This was the goal of the
  namespaces work in XML.</p>
  <h3><a name="ModularityInHTML" id="ModularityInHTML">Modularity
  in HTML</a></h3>
  <p>To be able to use the namespaces work in the extension of
  HTML, HTML has to transition from being an SGML application (with
  certain constraints) to being an XML based language. This will
  not only give it a certain ease of parsing, but allow it to build
  on the modularity introduced by namespaces.</p>
  <p>In fact, already in April of 1998 there was a W3C
  Recommendation for "MathML", defined as an XML language and
  obviously aimed at being usable in the context of an HTML
  document, but for which there was no defined way to write a
  combined HTML+MathML document. MathML was already waiting for XML
  namespaces.</p>
  <p>XML namespaces will allow an author (or authoring tool,
  hopefully) to declare exactly what set of tags he or she is using
  in a document. Later, schemas should allow a browser to decide
  what to do as a fall back when finding vocabulary which it does
  not understand.</p>
  <p>It is expected that new extensions to HTML be introduced as
  namespaces, possibly languages in their own right. The intent is
  that the new languages, where appropriate, will be able to use
  the existing work on style sheets, such as CSS, and the existing
  DOM work which defines a programming interface.</p>
  <h2><a name="Mixing" id="Mixing">Language mixing</a></h2>
  <p>Language mixing is an important facility, for HTML and for the
  evolution of all other Web and application technology. It must
  allow, in a mixed language document, for both languages to be
  well defined. A mixed language document is quite analogous to a
  program which makes calls to two runtime libraries, so it is not
  rocket science. It is not like an RPC message, which in most
  systems is very strongly typed from a single rigid definition. (An
  RPC message can be represented as a structured document but not,
  in general, vice versa.)</p>
  <p>Language mixing is a reality. Real HTML pages are often HTML
  with JavaScript, or HTML plus CSS, or both. They just aren't
  declared as such. In real life, many documents are made from
  multiple vocabularies, only some of which one understands. I
  don't understand half the information in the tax form - but I
  know enough to know what applies to me. The invoice is a good
  example. Many different coloured copies of the same document used
  to serve as a packing list, restocking sheet, invoice, and
  delivery note. Different parts of a company would understand
  different bits: the financial division would check amounts and
  signatures, the store would understand the part numbers, and the
  sales and marketing would define the relationship between the
  part numbers and prices.</p>
  <p>No longer can the Web tolerate the laxness with which HTML and HTTP
  have been extended. However, it cannot constrain itself to a
  system as rigid as a classical distributed object oriented
  system.</p>
  <p>The <a href="Extensible.html">note on namespaces</a> defines
  some requirements of a language framework which allows new
  schemata to be developed quite independently, and mixed within one
  document. This note elaborates on the sorts of things which have
  to be possible when the evolution occurs.</p>
  <h3><a name="power" id="power">The Power of schema
  languages</a></h3>
  <p>You may notice that nowhere in the architecture do XML or RDF
  specify what language the schema should be written in. This is
  because much of the future power of the system will lie in the
  power of the schema and related documents, so it is important to
  leave that open as a path for the future. In the short term, you
  can think of a schema being written in HTML and English. Indeed,
  this is enough to tie the significance of documents written in
  the schema to the law of the land and make the document an
  effective part of serious commercial or other social interaction.
  You can imagine a schema being in a sort of SGML DTD language
  which tells a computer program what constraints there are on the
  structure of documents, but nothing about their meaning. This
  allows a certain crude validity check to be made on a document
  but little else.</p>
  <p>Now let us imagine further power which we could put into a
  schema language.</p>
  <h2><a name="PartialUnderstanding" id="PartialUnderstanding">Partial Understanding</a></h2>
  <p>A crucial first milestone for the system is partial
  understanding. Let's use the scenario of an invoice, like the
  <a href="Extensible.html#Scenario">scenario in the "Extensible
  languages" note</a>. An invoice refers to two schemata: one is a
  well-known invoice schema and the other a proprietary part number
  schema. The requirement is that an invoice processing program can
  process the invoice without needing to understand the part
  description.</p>
  <p>Somehow the program must find out that the invoice is from its
  point of view just as valid as an invoice with the details of the
  part description stripped out.</p>
  <h3><a name="Optional" id="Optional">Optional parts</a></h3>
  <p>One possibility is to mark the part description as "optional"
  in the text. We could imagine a well-known way of doing this. It
  could be done in the document itself (as usual, using an
  arbitrary syntax):</p>
<pre>&lt;item&gt;
&lt;partnumber&gt;8137498237&lt;/&gt;
&lt;optional&gt;
 &lt;xml:using href="http://aeroco.com/1998/03/sy4" as="A"&gt;
   &lt;a:partdesc&gt;
        ...
   &lt;/a:partdesc&gt;
 &lt;/xml:using&gt;
&lt;/optional&gt;
&lt;/item&gt;
</pre>
  <p>There are problems with this. One is that we are relying on
  the invoice schema to define what an invoice is and isn't and
  what it means. It would be nice if the designer of the invoice
  could say whether the item should contain a part description or
  not, or whether it is possible to add things into the item
  description or not. But in general if there is something to be
  said we like to allow it to be said anywhere (like metadata). For
  the optionalness to be expressed elsewhere would save the
  writer of every invoice the bother of having to state it explicitly.</p>
  <h3><a name="Partial" id="Partial">Partial Understanding</a></h3>
  <p>The other more fundamental problem is that the notion of
  "optional" is subjective. We can be more precise about "partial
  understanding" by saying that the invoice processing system needs
  to convert the document which contains things it doesn't
  understand into a document which it does completely understand: a
  valid invoice. However, another agent may wish to convert the
  same detailed invoice into, say, a delivery note: in this case,
  quite different information would be "optional".</p>
  <p>To be more specific, then, we need to be able to describe a
  transformation from one document to another which preserves
  "validity" in some sense. A simple form of transformation is the
  removal of sections, but obviously there can be all kinds of
  level of transformation language ranging from the crudest to
  the Turing complete. Whatever the language, it makes a statement that,
  given a document x, some f(x) can be deduced.</p>
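  <p>The simplest such transformation, the removal of sections,
  can be sketched in a few lines of Python using the standard
  xml.etree.ElementTree module. This is only an illustration under
  stated assumptions: the invoice namespace URI and the function
  name are invented for the example, and the part number namespace
  is borrowed from the markup shown earlier. The tree is built in
  code rather than parsed, to keep the sketch self-contained.</p>
<pre>
import xml.etree.ElementTree as ET

INVOICE_NS = "http://example.org/invoice"   # hypothetical invoice schema
PARTS_NS = "http://aeroco.com/1998/03/sy4"  # proprietary part schema


def strip_unknown(elem, known_namespaces):
    """Remove, in place, every descendant whose namespace is not known."""
    for child in list(elem):
        tag = child.tag
        # ElementTree writes qualified names as "{namespace}local".
        ns = tag[1:].partition("}")[0] if tag.startswith("{") else ""
        if ns in known_namespaces:
            strip_unknown(child, known_namespaces)
        else:
            elem.remove(child)  # treated as optional by this reader
    return elem


# Build a small invoice item that mixes the two vocabularies.
item = ET.Element("{%s}item" % INVOICE_NS)
ET.SubElement(item, "{%s}partnumber" % INVOICE_NS).text = "8137498237"
ET.SubElement(item, "{%s}partdesc" % PARTS_NS).text = "..."

# f(x): the same item, reduced to what an invoice program understands.
strip_unknown(item, {INVOICE_NS})
</pre>
  <p>A delivery note program would call the same function with a
  different set of known namespaces, which is one way of seeing
  that "optional" depends on who is reading.</p>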
  <h3><a name="Least" id="Least">Principle of Least Power</a></h3>
  <p>In practice, this suggests that one should leave the actual
  choice of the transformation language as a flexibility point.
  However, as with most choices of computer language, the general
  "principle of least power" applies:</p>
  <table border="1" cellpadding="2">
    <tbody>
      <tr>
        <td>When expressing something, use the least powerful
        language you can.</td>
      </tr>
    </tbody>
  </table>
  <p><i>(@@justify in greater depth in footnote)</i></p>
  <p>While being able to express a very complex function may feel
  good, the result will in general be less useful. As Lao-Tse puts
  it, "<a href="Evolution.html#within">Usefulness from what is not
  there</a>". From the point of view of translation algorithms, one
  usefulness is for them to be reversible. In the case in which you
  are trying to prove something (such as access to a web site or
  financial credibility) you need to be able to derive a document
  of a given form. The rules you use are the pieces of the web of
  trust and you are looking for a path through the web of trust.
  Clearly, one approach is to enumerate all the things which can be
  deduced from a given document, but it is faster to have an idea
  of which algorithms to apply. Simple ones have input and output
  patterns. A deletion rule is a very simple case:</p>
  <p align="center">s/\(.*\)foo\(.*\)/\1\2/</p>
  <p>This is stream editor language for "Remove "foo" from any
  string leaving what was on either side". If this rule is allowed,
  it means that "foo" is optional. @@@ to be continued</p>
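  <p>For readers who want to try the deletion rule, here is a
  hedged equivalent using Python's re module. The function name is
  an assumption of this sketch, and the two capture groups which
  the back-references \1 and \2 rely on are written out
  explicitly.</p>
<pre>
import re


def delete_foo(text):
    """Remove one "foo" from the string, keeping what is on either side."""
    # (.*) before and after "foo" capture the two sides; \1\2 glues
    # them back together without the "foo" in between.
    return re.sub(r"(.*)foo(.*)", r"\1\2", text, count=1)
</pre>
  <p>A string with no "foo" in it passes through unchanged, so the
  rule can be applied blindly to any input.</p>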
  <p>Optional features and Partial Understanding</p>
  <ul>
    <li>Goal: V1 software partially understands V2 document</li>
    <li>Optional features visible as such</li>
    <li>Example: "Mandatory" Internet Draft</li>
    <li>Example: SMIL (P.Rec. 1998/4/9)</li>
    <li>Conversion from unknown language to known language.</li>
  </ul>
  <h1><a name="ToII" id="ToII">Test of Independent
  Invention</a></h1>
  <p>The test of independent invention is a thought experiment
  which tests one aspect of the quality of a design. When you
  design something, you make a number of important architectural
  decisions, such as how many wheels a car has, and that an arch
  will be used between the pillars of the vault. You make other
  arbitrary decisions such as the color of the car, the side of the
  road everyone will drive, whether to open the egg at the big end
  or the little end.</p>
  <p>Suppose it just happens that another group is designing the
  same sort of thing, tackling the same problem, somewhere else.
  They are quite unknown to you and you to them, but just suppose
  that being just as smart as you, they make all the same important
  architectural decisions. This you can expect if you believe that
  these decisions make logical sense. Imagine that they have the
  same philosophy: it is largely the philosophy which we are
  testing. However, imagine that they make all the arbitrary
  decisions differently. They complement bit 7. They drive on the
  other side of the road. They use red buoys on the starboard
  side, and use 575 lines per screen on their televisions.</p>
  <p>Now imagine that the two systems both work (locally), and
  being successful, grow and grow. After a while, they meet.
  Suddenly you discover each other. Suddenly, people want to work
  across both systems. They want to connect two road systems, two
  telephone systems, two networks, two webs. What happens?</p>
  <p>I tried originally to make WWW pass the test. Suppose someone
  had (and it was quite likely) invented a World Wide Web system
  somewhere else with the same principles. Suppose they called it
  the Multi Media Mesh <sup>(tm)</sup> and based it on Media
  Resource Identifiers<sup>(tm)</sup>, the MultiMedia Transport
  Protocol<sup>(tm)</sup>, and a Multi Media Markup
  Language<sup>(tm)</sup>. After a few years, the Web and the Mesh
  meet. What is the damage?</p>
  <ul>
    <li>A huge battle, involving the abandonment of projects,
    conversion or loss of data?</li>
    <li>Division of the world by a border commission into two
    separate communities?</li>
    <li>Smooth integration with only incremental effort?</li>
  </ul>
  <p>(see also <a href="../People/Berners-Lee/UU.html">WWW and
  Unitarian Universalism</a>)</p>
  <p>Obviously we are looking for the last option. Fortunately,
  we could immediately extend URIs to include "mmtp://" and extend
  MRIs to include "http:\\". We could make gateways, and on the
  better browsers immediately configure them to go through a
  gateway when finding a URI of the new type. The URI space is
  universal: it covers all addresses of all accessible objects. But
  it does not have to be the only universal space. Universal, but
  not unique. We could add MMML as a MIME type. And so on. However,
  if we had required all Web servers to synchronise through one and only
  one master lock server in Waltdorf, we would have found the Mesh
  required synchronisation through a master server in Melbourne. It
  would have failed.</p>
  <p>No system completely passes the ToII - it is always some
  trouble to convert.</p>
  <h3><a name="real" id="real">Not just a thought
  experiment</a></h3>
  <p>As the Web becomes the basis for many, many applications to be
  built on top of it, the phenomenon of independent invention will
  recur again and again. We have to build technology so as to make
  it easy for systems to pass the test, and so survive real life in
  an evolving world.</p>
  <p>If systems cannot pass the ToII, then we can only achieve
  worldwide interoperability when one original design has
  beaten the others. This can happen if we all sit down
  together as a worldwide committee and do a "top down" design of
  the whole thing before we start. This works for a new idea but
  not for the automation of something which, like pharmacy or
  trade, has been going on for centuries and is just being
  represented in the Semantic Web. For example, the library
  community has had endless trouble trying to agree on a single
  library card format (MARC record) worldwide.</p>
  <p>Another way it can happen is if one system is dropped
  completely, leading to a complete loss of the effort put into
  it. When in the late 1980s Europe eventually abandoned its suite
  of ISO protocols for networking because they just could not
  interwork with the Internet, a huge amount of work was lost. Many
  problems, solved in Europe but not in the US (including network
  addresses of more than 32 bits) had to be solved again on the
  Internet at great cost. Sweden actually changed from driving on
  the left to driving on the right. All over the world, people have
  changed word processor formats again and again but only at the
  cost of losing access to huge amounts of legacy information. The
  test of independent invention is not just a thought experiment,
  it is happening all the time.</p>
  <h1><a name="requirements" id="requirements">From philosophy to
  requirement</a></h1>
  <p>So now let us get more specific about what we really need in
  the underlying technology of the Semantic Web to allow systems in
  the future to pass the test of independent invention.</p>
  <h3><a name="smarter" id="smarter">We will be smarter</a></h3>
  <p>Our first assumption is that we will be smarter in the future.
  This means that we will produce better systems. We will want to
  move on from version 1 to version 2, from version n to version
  n+1.</p>
  <p>What happens now? A group of people use version 4 of a word
  processor and share some documents. One touches a document using a
  new version 5 of the same program. One of the other people tries
  to load it using version 4 of the software. The version 4 program
  reads the file, and finds it is a version 5 file. It declares that
  there is no way it can read the file, as it was produced in the
  future, and there is no way it can predict the future to know how
  to read a version 5 file. A flag day occurs: everyone in the
  group has to upgrade immediately - and often they never even
  planned to.</p>
  <p>So the first requirement is for a version 4 program to be able
  to read a version 5 file. Of course there will be some features
  in version 5 that the version 4 program will not be able to
  understand. But most of the time, we actually find that what we
  want to achieve can be done by partial understanding -
  understanding those parts of the document which correspond to
  functions which exist in version 4. But even though we know
  partial understanding would be acceptable, with most systems we
  don't know how to do even that.</p>
  <h3><a name="others" id="others">We are not the smartest</a></h3>
  <p>The philosophical assumption that we may not be smarter than
  everyone else (a huge step for some!) leads us to realise that
  others will have great ideas too, and will independently invent
  the same things. It forces us to consider the test of independent
  invention.</p>
  <p>The requirement for the system to pass the ToII is for one
  program which we write to be able to read somehow (partially if
  not totally) data written by the program written by the other
  folks. This simple operation is the key to decentralised
  evolution of our technology, and to the whole future of the
  Web.</p>
  <p>So we have deduced two requirements for the system from our
  simple philosophical assumptions:</p>
  <ul>
    <li>We will be smarter in the future
      <ul>
        <li>Technology: Moving Version 1 to Version 2</li>
      </ul>
    </li>
    <li>We are not smarter than everyone else
      <ul>
        <li>Decentralized evolution</li>
        <li>Technology: Moving between parallel Version A and
        Version B</li>
      </ul>
    </li>
  </ul>
  <h3><a name="sofar" id="sofar">The story so far</a></h3>
  <p>Where are we with the requirements for evolvability so far? We
  are looking for a technology which has free but well defined
  extension. We want to do it by allowing documents to use mixed
  vocabularies. We have already found out (from PICS work for
  example) that we need to be able to know whether extension
  vocabulary is mandatory or can be ignored. We want to use the Web
  for any registry, rather than any central point. The technology
  has to allow an application to be able to convert the output
  of a future version of itself, or the output of an equivalent
  program written independently, into something it can process, just
  by looking up schema information.</p>
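  <p>The distinction between mandatory and ignorable extension
  vocabulary can be made concrete with a small Python sketch. It is
  an assumption-laden illustration rather than a description of
  PICS or of any real format: the triple layout (name, value,
  mandatory flag), the exception class and the function name are
  all invented here.</p>
<pre>
class MandatoryExtensionError(Exception):
    """Raised when a mandatory extension is not understood."""


def accept(fields, understood):
    """Keep understood fields, skip optional ones, refuse mandatory ones.

    fields is a list of (name, value, mandatory) triples; understood
    is the set of names this version of the program can process.
    """
    kept = {}
    for name, value, mandatory in fields:
        if name in understood:
            kept[name] = value
        elif mandatory:
            # The sender said this cannot safely be ignored.
            raise MandatoryExtensionError(name)
        # Otherwise the field is optional and unknown: ignore it.
    return kept
</pre>
  <p>With this in place an older program either produces a
  document it fully understands or fails loudly, instead of
  guessing.</p>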
  <h2><a name="data" id="data">Evolution of data</a></h2>
  <p>Now let us look at the world of data on the Web, the <a href="Semantic.html">Semantic Web</a>, which I expect to
  become a new force in the next few years. By "data" as opposed to
  "documents", I am talking about information on the Web in a form
  specifically to aid automated processing rather than human
  browsing. "Data" is characterised by information with a well
  defined structure, where the atomic parts have well defined
  types, such as numbers and choices from finite sets. "Data", as
  in a relational database, normally has well defined meaning which
  has rarely been written down. When someone creates a new database,
  they have to give the data type of each column, but don't have to
  explain what the field name actually means in any way. So there
  is a well defined semantics but not one which can be accessed. In
  fact, the only time you tell the machine anything about the
  semantics is when you define which two columns of different
  tables are equivalent in some way, so that they can be used for
  example as the basis for joining the two databases. (That the
  meaning of data is only defined relative to the meaning of other
  data is of course quite normal - we don't expect machines to have
  any built in understanding of what "zip code" might mean apart
  from where you can read it and write it and what you can compare
  it with). Notice that what happens with real databases is that
  they are defined by users one day, and they evolve. They are
  rarely the result of a committee sitting down and deciding on a
  set of concepts to use across a company or an industry, and then
  designing the data schema. The schema is created on the fly by
  the user.</p>
  <p>We can distinguish two ways in which the word "schema" has
  been used:</p>
  <table border="1" cellpadding="2">
    <tbody>
      <tr>
        <td>Syntactic Schema: A document, real or imagined, which
        constrains the structure and/or type of data. <i>(pl.:
        Schemata)</i>.</td>
      </tr>
    </tbody>
  </table>
  <table border="1" cellpadding="2">
    <tbody>
      <tr>
        <td>Semantic schema: A document, real or imagined, which
        defines the inferences from one schema to another, thus
        defining the semantics of one syntactic schema in terms of
        another.</td>
      </tr>
    </tbody>
  </table>
  <p>I will use it for the first only. In fact, a syntactic schema
  defines a class of document, and often is accompanied by human
  documentation which provides some rough semantics.</p>
  <p>There is a huge amount ("legacy" would unfairly suggest
  obsolescence) of data in relational databases. A certain amount
  of it is being exported onto the web as virtual hypertext. There
  are many applications which allow one to make hypertext views of
  different aspects of a database, so that each server request is
  met by performing a database query, and then formatting the result
  as a report in HTML, with appropriate style and decoration.</p>
  <h2>Data about data: Metadata</h2>
  <p>Information about information is interesting in two ways.
  Firstly, it is interesting because the Web society desperately
  needs it to be able to manage social aspects of information such
  as endorsement (PICS labels, etc), ownership and access rights to
  information, privacy policies (P3P, etc), structuring and
  cataloguing information and a hundred other uses which I will not
  try to enumerate. This first aspect is discussed elsewhere. (See
  <a href="http://www.w3.org/DesignIssues/Metadata.html">Metadata
  architecture</a> about general treatment of metadata and labels,
  and the <a href="../TandS/Overview.html">Technology and Society
  domain</a> for an overview of many of the social drivers and related
  projects and technology.)</p>
  <p>The second interest in metadata is that it is data. If we are
  looking for a language for putting data onto the Web, in a
  machine understandable way, then metadata happens to be a first
  application area. Also, because metadata is fundamental to most
  data on the web, it is the focus of W3C effort, while many other
  forms of data are regarded as applications rather than core Web
  architecture, and so are not.</p>
  <h3>Publishing data on the web</h3>
  <p>Suppose for example that you run a server which provides
  online stock prices. Your application which today provides fancy
  web pages with a company's data in text and graphs (as GIFs)
  could tomorrow produce the same page as XML data, in tabular
  form, for machine access. The same page could even be produced at
  the same URL in two formats using content negotiation, or you
  could have a typed link between the machine-understandable and
  person-understandable versions.</p>
  <p>The XML version contains at the top (or somewhere) a pointer
  to a schema document. This pointer makes the document
  "self-describing". It is this pointer which is the key to any
  machine "understanding" of the page. By making the schema a first
  class object, in other words by giving its URL and nothing else,
  we are leaving the door open to many possibilities. Now it is time
  to look at the various sorts of schema document which it could
  point to.</p>
  <h2>Levels of schema language</h2>
  <p>Computer languages can be classified into various types, with
  various capabilities, and the sort we choose for the schema
  document, and the information we allow in the schema, fundamentally
  affect not just what the semantic web can be but, more
  importantly, how it can grow.</p>
  <p>The schema document can, broadly, be one of the following:</p>
  <ol>
    <li>Notional only: imaginary, non-existent but named.</li>
    <li>Human readable</li>
    <li>Machine-understandable and defining structure</li>
    <li>Machine-understandable and also defining which parts are
    optional</li>
    <li>A Turing-complete recipe for conversion into other
    languages</li>
    <li>A logical model of the document</li>
  </ol>
  <p>We'll go over the pros and cons of each, because none of these
  should be overlooked, but some are often way better than
  others.</p>
  <h3>Schema option 1: URI only</h3>
  <ul>
    <li>No supporting documentation</li>
    <li>Allows compatibility yes/no test</li>
  </ul>
  <p>This may sound like a silly trivial example, but like many
  trivial examples, it is not silly. If you just name your schema
  somewhere in URI space, then you have identified it. This doesn't
  offer a lot of help to anyone to find any documentation online,
  but one fundamental function is possible. Anyone can check
  compatibility: They can compare the schema against a list of
  schemata they do understand, and return yes or no.</p>
  <p>In fact, they can also use an index to look up information
  about the schema, including information about suitable software
  to download to add understanding of the document. In fact this
  level is the level which many RPC systems use: the interface is
  given a unique but otherwise random number which cannot be
  dereferenced directly.</p>
  <p>So this is the level of machine-understanding typical of
  distributed computing systems and should not be underestimated.
  There are lots of parts of URI space you can use for this: you
  might own some http: space (but never actually serve the document
  at that point), but if you don't, you can always generate a URI
  in a mid: or cid: space or if desperate in one of the hash
  spaces.</p>
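  <p>The yes/no compatibility test for a schema named only by its URI
  can be sketched in a few lines of Python. This is only an
  illustration: the URIs and the name is_compatible are invented
  here, and nothing is ever dereferenced.</p>

```python
# Hypothetical schema identifiers, invented for illustration only.
# None of them needs to be served anywhere: they are just names.
KNOWN_SCHEMATA = {
    "http://example.org/schemas/memo-v1",
    "mid:19980201.memo@example.org",
}


def is_compatible(schema_uri, known=KNOWN_SCHEMATA):
    """Level 1 test: compare identifiers only and answer yes or no."""
    return schema_uri in known
```

  <p>A receiver would call is_compatible with the schema URI found in
  an incoming document, and refuse (or go and look the URI up in an
  index) when the answer is no.</p>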
  <h3>Schema option 2: Human readable</h3>
  <p>The next step up from just using the Schema identifier as a
  document type identifier is to make that URI one which will
  dereference to a human-readable document. If you're a computer,
  big deal. But as well as allowing a strict compatibility test
  (test for equality of the schema URI), this also allows human
  beings to get involved if there is any argument as to what a
  document means. This can be significant! For example, the schema
  could point to a complete technical spec which is crammed with
  legalese about what the document does and does not imply and
  commit to. At the end of the day, all machine-understandable
  descriptions of documents are all very well, but until the day
  that they bootstrap themselves into legality, they must all in
  the end be defined in terms of human-readable legalese to have
  social effect. Human legalese is the schema language of our
  society. This is level 2.</p>
  <h3>Schema option 3: Define structure</h3>
  <p>Now we move into the meat of the schema system when we start
  to discuss schema documents which are machine readable. Now we
  are starting to enable some machine understanding and automatic
  processing of document types which have not been pre-programmed
  by people. Ça commence.</p>
  <p>The next level we consider is that when your browser (agent,
  whatever) dereferences the namespace URI, it finds a schema which
  defines the structure of the document. This is a bit like an SGML
  Document Type Definition (DTD). It allows you to do everything
  which the levels 1 and 2 allowed, if it has sufficient comments
  in it to allow human arguments to be settled.</p>
  <p>In addition, a system which has a way of defining structure
  allows everyone to have one and only one parser to handle all
  manner of documents. Any document coming across the threshold can
  be parsed into a tree.</p>
  <p>More than that, it allows a document to be validated against
  allowed structures. If a memo contains two subject fields, it is
  not valid. This is one of the principal uses of DTDs in SGML.</p>
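  <p>As a rough sketch of this kind of structural validation, assume
  a made-up memo language whose schema simply lists how many times
  each element may occur. The element names, the MEMO_SCHEMA table
  and both function names are assumptions for illustration; a real
  DTD expresses far more than occurrence counts.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical structural schema for a memo: each allowed element name
# maps to the (minimum, maximum) number of times it may occur.
MEMO_SCHEMA = {"to": (1, 1), "subject": (1, 1), "body": (1, 1)}


def build_memo(fields):
    """Build a memo tree from (element name, text) pairs."""
    memo = ET.Element("memo")
    for name, text in fields:
        ET.SubElement(memo, name).text = text
    return memo


def is_valid(doc, schema=MEMO_SCHEMA):
    """Check the children of doc against the allowed occurrence counts."""
    counts = {}
    for child in doc:
        counts[child.tag] = counts.get(child.tag, 0) + 1
    # Any element the schema does not mention makes the document invalid.
    if any(tag not in schema for tag in counts):
        return False
    return all(
        counts.get(tag, 0) in range(low, high + 1)
        for tag, (low, high) in schema.items()
    )
```

  <p>A memo built with two subject fields fails is_valid, while the
  same memo with one subject passes.</p>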
  <p>In some cases, there may be another spin-off. You can imagine
  that if the schema document lists the allowed structure of the
  document, and the types (and maybe names) of each element, then
  this would allow an agent to construct on the fly a graphic user
  interface for editing such a document. This was the intent with
  PICS rating systems: at least, a parent coming across a new
  rating system would be given a human-readable description of
  the various parameters and would be able to select</p>
  <h3>Schema option 4: Structure + Optional flags</h3>
  <p>The "optional" flag is a term I use here for a common crucial
  step which can make the difference between chaos and smooth
  evolution. All you need to do is to mark in the schema of a new
  version of the language which elements of the language can be
  ignored if you don't understand them. This simple step allows a
  processor which handled the old language, given the schema of the
  new language, to filter it so as to produce a document it can
  legitimately understand.</p>
  <p>Now we have a technology which has all the benefits to date,
  plus it can handle that elusive <strong>version 2 to version 1
  conversion</strong> problem!</p>
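  <p>The filtering step can be sketched as follows. This is a minimal
  sketch under stated assumptions: the element names, the two tables
  and the function names are hypothetical, and the "optional" flags
  are held in a Python dictionary rather than read from a real schema
  document.</p>

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical element names for a memo language.
# Elements a version 1 processor already understands:
V1_KNOWN = {"to", "subject", "body"}
# Flags from the (imagined) version 2 schema for its new elements:
# True means "may be ignored if you do not understand it".
V2_OPTIONAL = {"priority": True, "encrypted-with": False}


def downgrade(doc, known=V1_KNOWN, optional=V2_OPTIONAL):
    """Filter a version 2 document down to what a version 1 processor understands."""
    out = ET.Element(doc.tag)
    for child in doc:
        if child.tag in known:
            out.append(copy.deepcopy(child))
        elif optional.get(child.tag, False):
            continue  # unknown but flagged optional: drop it
        else:
            # unknown and not optional: the old processor must give up
            raise ValueError("cannot ignore element: " + child.tag)
    return out


def make_doc(tags):
    """Build a small example document from a list of element names."""
    doc = ET.Element("memo")
    for tag in tags:
        ET.SubElement(doc, tag).text = "..."
    return doc
```

  <p>Downgrading a document that contains priority silently drops
  that element; downgrading one that contains encrypted-with raises
  an error, because the schema did not mark it as ignorable.</p>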
  <h3>Schema option 5: Turing complete language</h3>
  <p>Always in languages there is the balance between the
  declarative limited language, whose formulae can be easily
  manipulated, and the powerful programming language whose programs
  cannot be analyzed in general, but which have to be left to run
  to see what they do. Each end of the spectrum has its benefits.
  In describing a language in terms of another, one way is to
  provide a black box program, say in Java or Javascript, which
  will convert from one to the other.</p>
  <p>Filters written in Turing-complete languages generally have to
  be trusted, as you can't see what rules they are based on by
  looking at them. But they can do weird and wonderful things.
  (They can also crash and loop forever of course!).</p>
  <p>A good language for conversion from one XML-based language to
  another is XSL. It started off as a template-like system for
  building one document from another (and can be very simple) but
  is in fact Turing-complete.</p>
  <p>When you do publish a program to convert language A to
  language B, then anyone who trusts it has that capability. A
  disadvantage is that they never know how it works. You can't
  deduce things about the individual components of the languages.
  You can't therefore infer much indirectly about relationships to
  other languages. The only way such a filter can be used is to get
  whatever you have into language A and then put it through the
  filter. This might be useful. But it isn't as fascinating as the
  option of blowing language A open.</p>
  <h3>Schema option 6: Expose logic of document</h3>
  <p>What is fundamentally more exciting is to write down as
  explicitly as possible what the new language means. Sorry, let me
  take that back, in case you think that I am talking about some
  absolute meaning of meaning. If you know me, I am not. All I mean
  is that we write in a machine-processable logical way the
  equivalences and conversions which are possible in and out of
  language A from other languages. And other languages.</p>
  <p>A specific case of course, is when we document the
  relationship between version 2 and version 1. The schema document
  for version 2 could explain that all the terms are synonyms,
  except for some new terms which can be converted to nothing (i.e.
  are optional) and some which affect the meaning of the document
  completely and so if you don't understand them you are stuck.</p>
  <p>In a more general case, take a language like iCalendar in RDF
  (were it in RDF), which is for describing events as would be in a
  personal organizer. A schema for the language might declare
  equivalences between a calendar's concept of group MEMBERship and
  an access control system's concept of group membership; it might
  declare the concept of LOCATION to be equivalent to the text
  description of a Geographical Information Systems standard's
  location, and it may declare an INDIVIDUAL to be a superset of
  the HR department's concept of employee. These bits of
  information are the stuff of the semantic web, as they allow
  inference to stretch across the globe and conclude things which
  we knew as a whole but no one person knew. This is what RDF and the
  Semantic Web logic built on top of it are all about.</p>
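  <p>To make the idea of inference across vocabularies concrete, here
  is a toy sketch. It is not RDF and uses no RDF library: facts are
  plain (subject, property, object) tuples, and the property names
  cal:member and acl:memberOf are invented stand-ins for the calendar
  and access control concepts mentioned above.</p>

```python
# Hypothetical property names; neither is quoted from a real schema.
EQUIVALENT = [
    ("cal:member", "acl:memberOf"),
]


def infer(facts, equivalent=EQUIVALENT):
    """Close a set of (subject, property, object) facts under the
    declared property equivalences, in both directions."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(derived):
            for a, b in equivalent:
                for src, dst in ((a, b), (b, a)):
                    if p == src and (s, dst, o) not in derived:
                        derived.add((s, dst, o))
                        changed = True
    return derived
```

  <p>Given only the calendar's statement that alice is a member of a
  team, infer also yields the access control statement, which is
  something neither system had written down on its own.</p>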
  ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 01 Feb 1998 00:00:00 GMT</pubDate>
  <title>Extensible languages and web evolution</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Extensible.html</link>
    <guid>https://www.w3.org/DesignIssues/Extensible.html</guid>
      <description><![CDATA[
    <p><a href="Overview.html">Up to Design Issues</a></p>
    <h2>Contents</h2>
    <p>Extensible languages</p>
    <ul>
      <li>
        <a href="#Introduction">Introduction</a>
      </li>
      <li>
        <a href="#Requirements">Requirements</a>
        <ul>
          <li>
            <a href="#Glossary">Glossary</a>
          </li>
          <li>
            <a href="#Mixing">Mixing vocabularies</a>
          </li>
          <li>
            <a href="#Scenario">Scenario</a>
          </li>
          <li>
            <a href="#Local">Local scope</a>
          </li>
          <li>
            <a href="#Ambiguity">Lack of ambiguity</a>
          </li>
          <li>
            <a href="#Evolving">Evolving new scheme languages</a>
          </li>
          <li>
            <a href="#Correctness">Correctness of documents with
            multiple vocabularies</a>
          </li>
          <li>
            <a href="#Granularity">Granularity</a>
          </li>
          <li>
            <a href="#Incorporation">Incorporation into the
            language</a>
          </li>
        </ul>
      </li>
      <li>
        <a href="#Related">Related resources</a>
      </li>
    </ul>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 01 Jan 2021 00:00:00 GMT</pubDate>
  <title>Feeds:</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Feeds.html</link>
    <guid>https://www.w3.org/DesignIssues/Feeds.html</guid>
      <description><![CDATA[
  <h1>Feeds</h1>
  <div>
    <hr>
    <div class="cols">
      <p>Feeds of various sorts have been a feature since you could
      first subscribe to blogs using various forms of RSS. Let's
      call a feed a sequence of published things that you can
      subscribe to, with in some cases a mechanism to inform you
      when there is more added to it. So a feed by itself is a
      one-way thing with no feedback.</p>
      <p>While blog feeds were the rage at one point, podcasts took
      over the limelight with the eponymous iPod, and now it seems
      people happily move between, and sometimes convert between,
      text, audio and video blogs, but fitness sessions and photo
      streams do not play in the same space.</p>
      <p>Here is a very rough summary of some existing feeds in
      2021, and some things which don't have feeds.</p>
      <table>
        <tbody><tr>
          <th>Medium</th>
          <th>Post format</th>
          <th>Dominant platform</th>
          <th>Response actions</th>
        </tr>
        <tr>
          <td>Text blog</td>
          <td>HTML</td>
          <td>--</td>
          <td>Blog comments</td>
        </tr>
        <tr>
          <td>Photo</td>
          <td>JPEG</td>
          <td>Instagram</td>
          <td>Like, Comment</td>
        </tr>
        <tr>
          <td>Audio podcast</td>
          <td>MP3</td>
          <td>--</td>
          <td>--</td>
        </tr>
        <tr>
          <td>Video podcast</td>
          <td>MP4?</td>
          <td>YouTube</td>
          <td>Comment</td>
        </tr>
        <tr>
          <td>Movie</td>
          <td>IMDB-RDF</td>
          <td>Netflix, Green Tomatoes &gt; Media Kraken</td>
          <td>Rating (GT)</td>
        </tr>
        <tr>
          <td>Book</td>
          <td>LoC RDF?</td>
          <td>Amazon</td>
          <td>5 Star rating</td>
        </tr>
        <tr>
          <td>Fitness</td>
          <td>GPX</td>
          <td>Strava, Fitbit etc</td>
          <td>Kudos, comment</td>
        </tr>
      </tbody></table>
      <p>Strava and Instagram, as closed platforms, manage the
      identity of their users and their feeds, in each case one
      user per feed, with the access control of who can see what,
      and social actions like likes/kudos and comments on a post.
      In each case positive feedback can be private or public.</p>
      <h2>Activity Streams</h2>
      <p>"Little "a" activity streams are a UI paradigm for
      displaying recent activity within a context. Activities are
      typically displayed in reverse chronological order and
      consist of relatively simple statements such as "John
      uploaded a new photo" or "12 people liked Sally's post"." --
      the <a target="refs" href="https://www.w3.org/wiki/Activity_Streams">AS
      wiki</a></p>
      <p>Activity Streams activities are quite broad, not just
      publishing something. From the <a target="refs" href="https://www.w3.org/ns/activitystreams-owl.ttl">OWL
      ontology</a>:</p>
      <pre>  as:Activity a owl:Class ;
    rdfs:label "Activity"@en ;
    rdfs:subClassOf as:Object ;
    rdfs:comment "An Object representing some form of Action that has been taken"@en .
</pre>
      <p>And there are 22 subclasses defined: Accept,
      IntransitiveActivity, Add, Announce,
      Create, Delete, Dislike, Flag, Follow, Ignore, Join, Leave,
      Like, View, Listen, Read, Move, Offer, Reject, Remove, Undo,
      Update. This reminds one of social Actions which have been
      pulled out in schema.org, with subclasses AchieveAction,
      AssessAction, ConsumeAction, ControlAction, CreateAction,
      FindAction, InteractAction, MoveAction, OrganizeAction,
      PlayAction, SearchAction, SeekToAction, SolveMathAction,
      TradeAction, TransferAction, and UpdateAction.</p>
      <p>A GitHub feed would maybe have things like requested review,
      started review, completed review, approved. In general, for any
      collaborative system in which the shared state changes from
      time to time, like for example an issue tracker or action
      list, those state-based systems could well be deemed to
      create Activity Stream events whenever that state
      changes.</p>
      <p>By agreeing on a common interoperable representation and
      workflows for social actions, the Solid world can allow the
      same smooth user experience no matter what it is they are
      reacting to -- liking, and so on.</p>
      <p>The Solid mantra that "you should be able to do anything with
      anything" clearly suggests here that whatever that thing is,
      a person ought to be able to record their reaction, and so
      on.</p>
      <h3>Scope of social reaction</h3>
      <p>Clearly the scope of a social reaction, whether it is public
      or personal and confidential or shared with different groups,
      different communities, is key. With Instagram and Strava each
      person has one group of followers, so a choice may be not
      sharing, sharing with followers, or making something
      completely public. In a Solid world, where a person has many
      related groups of different sizes, and can be a member of
      different communities for different reasons, making sure the
      reaction has the right scope is very important.</p>
      <h3>Possible harmful consequences</h3>
      <p>There is research, and common experience, that suggests
      that social reaction systems like these can lead to harmful
      unintended consequences, or harmful processes can be set up,
      deliberately or subconsciously. Examples are people's
      unhealthy preoccupation with the public reactions to their
      posts, or bullying comments, and so on. A Web Foundation report
      discusses, for example, the problem of Online Gender-based
      Violence. So any new systems we design should involve
      investigation and modeling of these things, and where it is
      possible, explicit design to avoid harm. This is beyond the scope
      of this article.</p>
      <h2>Sets of Feeds</h2>
      <p>If you want to use, in the Solid
      tradition, many different feed-consuming apps with the same set
      of subscriptions, then you need interop at that level: the
      set of feeds I subscribe to. The NetNewsWire app will export
      its list of subscriptions in OPML like:</p>
      <pre>      &lt;?xml version="1.0" encoding="UTF-8"?&gt;
      &lt;!-- OPML generated by NetNewsWire --&gt;
      &lt;opml version="1.1"&gt;
        &lt;head&gt;
          &lt;title&gt;Subscriptions-OnMyMac.opml&lt;/title&gt;
        &lt;/head&gt;
        &lt;body&gt;
          &lt;outline text="ongoing by Tim Bray"
              title="ongoing by Tim Bray"
              description="" type="rss"
              version="RSS"
              htmlUrl=""
              xmlUrl="http://www.tbray.org/ongoing/ongoing.atom"/&gt;
        &lt;/body&gt;
      &lt;/opml&gt;
    </pre>
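      <p>Any app that wants to share the same set of subscriptions
      only needs to pull the feed addresses back out of such a file.
      A minimal sketch with the Python standard library follows; the
      function name feed_urls is made up for illustration, and it
      assumes each subscription is an outline element carrying an
      xmlUrl attribute, as in the export above.</p>

```python
import xml.etree.ElementTree as ET


def feed_urls(opml_text):
    """Return the xmlUrl of every outline element that has one.

    Outlines without an xmlUrl (for example folders that only group
    other outlines) are skipped; nested outlines are still visited.
    """
    root = ET.fromstring(opml_text)
    return [
        outline.get("xmlUrl")
        for outline in root.iter("outline")
        if outline.get("xmlUrl")
    ]
```

      <p>Passing the text of an exported OPML file to feed_urls gives
      a plain list of feed addresses, which is the piece of shared
      state that different feed readers would need to agree on.</p>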
      <p>Relatively recently --- "NetNewsWire 6 for iOS — which
      includes new features iCloud syncing, home screen widgets,
      and a bunch more — was submitted to App Store review today.
      The team is super-psyched to get this release shipping!"</p>
      <h3>Self-hosted feed set sync</h3>
      <p>NetNewsWire uses a number of protocols to sync your set of
      RSS feeds across devices.</p><img style="width:50%;" src="diagrams/feeds/nnw-account-type.png">
      <p><a target="refs" href="#FreshRSS">FreshRSS</a> looks interesting</p>
      <blockquote>
        FreshRSS can manage +100k articles without complaining.
        FreshRSS works on mobile Read your RSS feeds on your mobile
        without requiring any third-party application. Self
        host-able: Your data is yours! Host your aggregator and do
        not depend on anyone.
      </blockquote>
      <h2>Discovering feeds</h2>
      <p>You hear often "Get this from X or Y or wherever you get
      your podcasts". There is an assumption that the various
      platforms have more or less organized a list of all the
      podcasts by name. This works maybe for the famous ones but
      clearly doesn't scale if we encourage everyone to have a
      feed.</p>
      <p>You can find feeds and subscribe to them by following
      links. Links from other posts in other feeds, links sent in
      email, links in the minutes of a meeting, and so on. The link
      you get may be to the feed itself, or it may be to a
      particular post, in which case you can find the feed.</p>
      <p>The global system needs to provide both serendipity, when
      you come across new resources as a surprise while not
      looking for them, and the ability to track down feeds
      which you have heard about or you imagine might exist. The
      algorithms we are talking about here are those which, in a
      Solid world, will determine which information people come
      across. In a present world with much misinformation and much
      disinformation, they are important.</p>
      <h2>Protocols vs Apps</h2>
      <p><i>"This podcast was generated by Audm. Get the Audm app
      to follow this, and other things from ..."</i> No, don't get
      the Audm app, or whatever app they have made for each
      podcast. Use your generic podcast reader, like Overcast, or
      the default one on your device. Keep the RSS protocol alive!
      Allow yourself to customize the way you listen to things, and
      keep track of what you have read or listened to in one
      place.</p>
      <p>This is an important issue with feeds. If the feed
      providers have made a web site or an app with a richer
      experience than the feed, would it not be great to be able to
      slip between the different versions while keeping track of
      what you have read, and where you read it?</p>
      <h2>Synchronized feeds</h2>
      <p>A few years ago I read <a target="refs" href="https://nealstphenson.com/">Neal Stephenson's</a> novel
      <i>Termination Shock</i> in both audio and written text
      versions at once, skipping between the formats quite easily.
      That is really the way we should be consuming media in many
      cases: switching from sound to text when another person is
      around, back to sound when we are driving, and so on. It is
      about efficiency, and about accessibility.
    </p>
    <p>Presumably the way to do it in RSS would be to have 
      more than one "attachment" to the feed post, and then having some sort of mapping between 
      characters in the text and time in the audio (or video).
    </p>

      <p>[2024-12] I recently found an app <a target="refs" href="https://speechify.com/">Speechify</a> which allows me to import
      text, Google Docs, etc., and read-listen to it smoothly. Just what the doctor ordered.
      It keeps track of what I am reading and where I am across devices.
      Wishlist, of course: a version which will
    store my data about where I am in my pod instead of in the Speechify cloud.</p>

    <h2>Keep it open - 2024</h2>
    <p>This was first written in 2021. In 2024, there are different threats to the open podcast world.
    Spotify wants desperately to be my podcast player -- annoying me
    by suggesting podcasts when I am using it for music.
    But I use Overcast, an independent app, for my podcasts, thank you.
    
    And now, I gather Spotify and Apple Podcasts, while playing anyone's feed,
    also have some proprietary non-interoperable feeds you can only get on Apple Podcasts or Spotify.
    A good policy is not to listen to the ones which don't use the standard.

      </p><h2>Feeds and Feedback</h2>
      
      <p>Originally that was the title of this post, as I was
      thinking of talking about protocols like Linkback and Pingback
      to record when a link is made to a post --typically when
      somebody makes a reference to it in their own blog. But that
      will have to wait for another time.</p>
      <hr>
      <h2 id="updates">Updates</h2>

      <dl>
        <dt>MNOT</dt><dd>Mark Nottingham, co-author of Atom, in his 25 August 2024 blog post <a href="https://www.mnot.net/blog/2024/08/25/feeds">What RSS Needs</a>,
        discusses a new take on the world of feeds.</dd>
      </dl>
      <h2 id="references">References</h2>
      <ul>
        <li>
          <a target="refs" href="https://support.google.com/podcast-publishers/answer/9476656">
          Google on how Podcasts work and are listed</a>
        </li>
        <li>
          <a name="FreshRSS" target="refs" href="https://www.freshrss.org/" id="FreshRSS">FreshRSS - open source software (and protocol)
          to save RSS feed sets</a>
        </li>
        <li>GreenGeeks Web Hosting, <a target="refs" href="https://www.greengeeks.com/tutorials/what-are-trackbacks-and-pingbacks-in-wordpress/#:~:text=A%20pingback%20is%20a%20type,you've%20linked%20to%20them.">
          What are Trackbacks and Pingbacks in WordPress?</a>
        </li>
        <li>Web Foundation · November 25, 2020 <a target="refs" href="https://webfoundation.org/2020/11/the-impact-of-online-gender-based-violence-on-women-in-public-life/">
          The impact of online gender-based violence on women in
          public life</a>
        </li>
      </ul>
    </div>
    <p><a href="Overview.html">Up to Design Issues</a></p>
    <p><a href="../People/Berners-Lee">Tim BL</a></p>
  </div>


]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 19 Dec 1997 00:00:00 GMT</pubDate>
  <title>Filtering and censorship</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Filtering.html</link>
    <guid>https://www.w3.org/DesignIssues/Filtering.html</guid>
      <description><![CDATA[
  <h1>Filtering and Censorship</h1>
  <p>Information is powerful stuff. &nbsp;The world has been
  enthralled by the power which the Web, the universe of
  accessible&nbsp;information, gives to people and to groups
  working and playing together.</p>
  <p>Information about information is powerful not just as
  information, but because it allows one to leverage one's use of
  information, to benefit from that which is relevant, accurate,
  stylish, unbiased, or timely, -- whatever one regards as being
  of&nbsp;"quality" -- without being enmired in that which is
  not.</p>
  <p>Powerful tools are often usable for constructive or
  destructive purposes just as paper and ink can be used for truth or
  lies, and metal for ploughshares or swords. The Web's power stems
  from its universality - for example that a hypertext link can
  point to any information out there, not just a subset.
  &nbsp;People have asked whether I regret that the Web has been
  used for some uses, but I have to reply that if somehow it had
  been built to control the material which was placed in it, then
  that would be the technology controlling society, rather than the
  other way around as it should be.</p>
  <p>True, one could take the view that our society is not strong
  enough to be trusted with a powerful information system.
  &nbsp;One could take the view that society does not currently
  have the wherewithal to prevent the Web from being abused by
  destructive forces to an extent that the overall pain is greater
  than the gain. &nbsp;I do not believe this is true. In the
  western developed world, at least, I believe that the democratic
  process will have sufficient control over governments and the
  judicial process sufficient control of criminals, to continue to
  defend the health of the evolving society.</p>
  <p>We should be very careful, by constant inspection, to ensure
  that this continues to be the case.</p>
  <h3>Filtering and Censorship</h3>
  <p>One of the threats which posed itself in 1994 was of
  government censorship over information on the Web. &nbsp;In
  general, there are information acts which societies regard as
  legal, and those which are illegal (such as fraud). &nbsp;The
  problem which arose was that in the very subjective question of
  what information is deemed suitable for children, there was a
  threat that, in order to "protect" children, seeing no other
  alternative, governments were contemplating making draconian
  legislation, for example prohibiting the transmission of
  "indecent" material. The problems here were many.</p>
  <p>First of all, the concept of "indecent" was being enforced as
  a central single concept, quite against the distributed
  subjective nature of its definition in society. &nbsp;The Web
  works as a decentralized system, with no hierarchical or other
  structure to force society into a shape imposed by technology.
  &nbsp;This works. &nbsp;Centralization of such an idea would
  prevent the Web from being an accurate mirror of society
  itself.</p>
  <p>Secondly, the problem being solved was the reading of such
  information by children, not its transmission. Thirdly, the
  question of "transmission" seemed to include intermediate parties
  who were not responsible for the content in an editorial or
  authorship sense. And one could list other problems, but this is
  enough for the present.</p>
  <h3>Information Quality</h3>
  <p>The basic problem being addressed was that of subjective
  information "quality". &nbsp;This is the same problem reported by
  newcomers to the web who find (typically after a search engine
  search) too much "junk".</p>
  <p>It is unreasonable to ask for information delivered from the
  web to be of consistently high "quality" if you can't define what
  "quality" is. &nbsp;There is a need, then, to be able to
  represent "quality" in a completely subjective way.</p>
  <p>This is what the PICS project was all about. &nbsp;PICS was
  specifically aimed at demonstrating that individuals could obtain
  their own subjective notion of quality without the government
  having to try to "protect" them by enforcing some centralized
  notion. &nbsp;Politically, PICS is a system necessary for the
  preservation of free speech on the Internet.</p>
  <p>The system needed a few different sorts of documents:</p>
  <dl>
    <dt>a "rating system"</dt>
    <dd>which defines a scale or scales on which one might judge a
    document. &nbsp;The fact that anyone can create one of these is
    a strong force allowing decentralization of concept, breaking
    the problem of the global, centralized definition of, for
    example, "indecency". &nbsp;PICS allows communities of any size
    (from one up) to establish their own criteria. &nbsp;Agreement
    over a large community enhances global harmony, but threatens
    diversity. Agreement over a small community does the reverse.
    So in fact some balance is necessary.</dd>
    <dt>a "label"</dt>
    <dd>which is a statement about something in terms of the
    schema. &nbsp;This can be made by any party, not just author or
    reader, and certainly not just central government. &nbsp;These
    can be created and exchanged in all manner of ways, so the PICS
    standard for interoperability is essential.</dd>
    <dt>a "profile"</dt>
    <dd>which describes for a given person the particular rating
    systems and levels on those scales which represent "quality"
    information at a given time and in a given context. &nbsp;This
    sort of information can either be input by a person using a
    graphic interface (such as a set of sliders in a dialog box),
    or can simply be transferred from someone they trust, whether
    family, organization, or friend. Inability to transfer this
    would prevent people from building their own communities with
    common standards of trust: hence the importance of this
    (PICSRules) as a standard.</dd>
  </dl>
  <p>These are all subsets of a general metadata language, designed
  to be easy for people to use. &nbsp; In particular, by being
  limited in their power, they allow graphic interfaces to be
  built.</p>
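  <p>To make the three kinds of document concrete, here is a minimal
  sketch in Python. It is an illustration only and does not follow the
  actual PICS syntax: the class names, the example scales and the
  example.org URLs are all assumptions invented for this example.</p>

```python
from dataclasses import dataclass, field


@dataclass
class RatingSystem:
    """A named set of scales; anyone may define one (hypothetical model)."""
    url: str
    scales: tuple


@dataclass
class Label:
    """A statement by some party about a document, using a rating system."""
    about: str      # URL of the document being rated
    system: str     # URL of the rating system used
    rated_by: str   # who made the statement
    values: dict = field(default_factory=dict)


@dataclass
class Profile:
    """One person's choice: trusted raters and the highest value accepted per scale."""
    system: str
    trusted_raters: set
    limits: dict

    def accepts(self, label):
        # Ignore labels from other rating systems or from untrusted parties.
        if label.system != self.system or label.rated_by not in self.trusted_raters:
            return False
        # Every scale the profile cares about must be rated and within its limit.
        return all(
            scale in label.values and limit >= label.values[scale]
            for scale, limit in self.limits.items()
        )


system = RatingSystem("http://example.org/ratings/v1", ("violence", "language"))
rater = "http://example.org/parents-group"
label = Label("http://example.org/page.html", system.url, rater,
              {"violence": 1, "language": 0})
strict = Profile(system.url, {rater}, {"violence": 0, "language": 0})
relaxed = Profile(system.url, {rater}, {"violence": 2, "language": 1})
print(strict.accepts(label), relaxed.accepts(label))
```

  <p>The same label is rejected by one profile and accepted by the
  other, which is the point: the judgement lives in the profile, not in
  any central definition.</p>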
  <h3>On social responsibility of technologists</h3>
  <p>The argument has been made that PICS technology should be
  suppressed as the power it gives may be abused by governments.
  &nbsp;(There are even those who have suggested that the whole
  scheme is a government inspired plot to promote censorship and
  limit free speech. &nbsp;This is certainly not the case,
  neither in the idea, &nbsp;the funding nor the intent.)
  &nbsp;Whereas most readers may find this far-fetched, it is worth
  a response on principle.</p>
  <p>As I pointed out when closing the first International World
  Wide Web conference, speaking to (then a mere 350) geeky web
  enthusiasts, I firmly believe it is the task of scientists and
  technologists to be aware of and responsible for the social
  implications of their work. &nbsp;This cannot just be left to
  "professional socially responsible people", as each engineer and
  scientist is often &nbsp;best aware of the potential of the work.
  &nbsp;Uttered in the auditorium at CERN, whose particle
  physicists trace their roots through nuclear physics, I don't
  think the message went unheard, even though it may have sounded
  strange in such a new field. &nbsp;Now (1997), the World Wide Web
  Consortium has one of its three domains dedicated to the
  relationship between Technology and Society.</p>
  <h3>So what about PICS?</h3>
  <p>The question basically is whether the potential danger of the
  technology outweighs the freedom and positive good it accords.
  &nbsp;You can certainly argue this for nuclear fission, and you
  can certainly argue it for the wide distribution of firearms in
  populous countries. Can you argue this for PICS and metadata?
  &nbsp;Is there anything about PICS specifically or metadata in
  general which makes it more of a danger than a boon?</p>
  <p>The specific types of document in PICS are very general. As a
  system, it is quite generalist, and extremely decentralized.
  &nbsp;It does as good a job as it can of leaving policy up to
  others to set, although it does (compared with other systems one
  could imagine) tend to favor by its nature cultural diversity,
  and freedom of speech, including freedom to endorse others'
  work.</p>
  <p>&nbsp;The specifications of communication protocols enable,
  but do not enforce,&nbsp;what manufactured software will or will
  not be able to do. &nbsp;One cannot, therefore, at this level say
  what individuals will be able to do. &nbsp;The technology can
  leave the policy up to others, which leaves other groups to
  ensure that the values which they hold dear are not lost in new
  legislation, industry practices, or public apathy.</p>
  <p>A metainformation system allows one to talk about information.
  It enables all kinds of uses of information:</p>
  <ul>
    <li>finding information</li>
    <li>talking about information</li>
    <li>making laws about information</li>
    <li>breaking laws about information</li>
  </ul>
  <p>It is not the place of a technical metadata system to try to
  limit the statements one can make with metadata, or the laws if
  any which are made. &nbsp;That is the role of the democratic
  process and whatever government the people trust. The W3C as an
  industry consortium can act for industry in promoting standards,
  but cannot act to create laws. &nbsp;What we can do is explain to
  lawmakers and others the effect and intention of technology.
  &nbsp;That is what this article attempts to do.</p>
  <h4>Conclusion</h4>
  <p>So Metadata, PICS and otherwise, is powerful, as is
  information in general. &nbsp;Constant vigilance by concerned
  members of the public, industry and government is a very
  important part of the system of controls which keeps society
  healthy. &nbsp;The PICS technology was created
  specifically&nbsp;in order to reduce the risk of government
  censorship in civilized countries. It was the result of members
  of the industrial community being concerned about the behaviour
  of government. The indications are that in this it will succeed,
  &nbsp;but that does not remove the need for such vigilance.
  &nbsp;</p>
  <p>To conclude, out of fear or ignorance, that PICS is more of a
  danger than it is a boon would be to throw the baby out with the
  bathwater. &nbsp;Metadata is not just a new tool, it is the start
  of a machine-understandable web (a "web phase 2") of information
  whose impact should be as empowering &nbsp;to humanity as the
  human-understandable web of today. &nbsp;We must understand it as
  we build it.</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 26 Apr 2019 00:00:00 GMT</pubDate>
  <title>Linked Data Shapes, Forms and Footprints</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Footprints.html</link>
    <guid>https://www.w3.org/DesignIssues/Footprints.html</guid>
      <description><![CDATA[
    <p>In a world of linked data, in which anyone can say anything
    about anything, how do we build systems in which users and apps
    are easily allowed to express useful, helpful things? What
    tools can we use which allow new systems to grow easily and
    work well together?</p>
    <h3 id="ontology-languages">Ontology languages</h3>
    <p>The RDF schema languages, RDF Schema and OWL, tell you
    implications one can draw from RDF Model data. They also tell
    you what things do not make logical sense. Therefore in a sense
    they indirectly have the function of constraining what RDF data
    one can write, though just by telling what would be nonsense
    (false). So they can in a rather weak way be used to guide a
    user interface. But that won't do what we need.</p>
    <p>Other schema systems, like that of <a href="https://schema.org/">schema.org</a>, give suggestions as to
    what predicates can be used to talk about objects of a given
    class. That is useful, but still is not enough.</p>
    <p>In this document, we will discuss three kinds of
    technologies to help with building apps on top of data:</p>
    <ol>
      <li>
        <a href="#shapes">Shapes</a> explain to machines what data
        should look like, independently of how that data is
        displayed to a user.
      </li>
      <li>
        <a href="#forms">Forms</a> are a user interface allowing
        people to read and write data in a specific shape.
      </li>
      <li>
        <a href="#footprints">Footprints</a> explain to machines
        where new data should be stored.
      </li>
    </ol>
  
<p><a href="https://www.w3.org/DesignIssues/Footprints.html">Read rest of article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 31 Dec 1998 00:00:00 GMT</pubDate>
  <title>Fractal web, fractal society</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Fractal.html</link>
    <guid>https://www.w3.org/DesignIssues/Fractal.html</guid>
      <description><![CDATA[
  <h1>The Scale-free nature of the Web</h1>
  <p>This article was originally entitled "The Fractal nature of
  the web". Since then, I have been assured that while many people
  seem to use <em>fractal</em> to refer to a Zipf (1/f)
  distribution, it should really only be used in spaces of finite
  dimension, like the two-dimensional planes of Mandelbrot sets.
  The correct term for the Web, then, is <em>scale-free</em>.</p>
  <p>This isn't an observation so much as a requirement.</p>
  <p>I have <a href="#Berners-Lee">discussed elsewhere</a> how we
  must avoid the two opposite social deaths of a global monoculture
  and a set of isolated cults, and how the fractal patterns found
  in nature seem to present themselves as a good compromise. It
  seems that the compromise between stability and diversity is
  served by there being the same amount of structure at all scales. I
  have no mathematical theory to demonstrate that this is an
  optimization of some metric for the resilience of society and its
  effectiveness as an organism, nor have I even that metric. (Mail
  me if you do!)</p>
  <p>However, it seems from experience that groups are stable when
  they have a set of peers, when they have a substructure. Neither
  the set of peers nor the substructure must involve huge numbers,
  as groups cannot "scale", that is, work effectively with a very
  large number of liaisons with peers, or when composed as a set of
  a very large number of parts. If this is the case then by
  induction there must be a continuum of group sizes from the very
  largest to the very smallest.</p>
  <p>This seems to be a general rule which can guide our design,
  and against which we can measure actual patterns of use.</p>
  <p>It is in fact another aspect of the tension between many
  languages and one global language. Locally defined languages are
  easy to create, needing local consensus about meaning: only a
  limited number of people have to share a mental pattern of
  relationships which define the meaning. However, global languages
  are so much more effective at communication, reaching the parts
  that local languages cannot. This tension is exemplified in the
  standards process, when ideas have to be exposed to successively
  larger and larger groups, with friction and hard work at each
  stage.</p>
  <p>Other interesting things to model passing through a fractal
  system include DNA traits in intermarrying populations. (Someone
  suggested (who?) that the invention of the bicycle made a great
  difference to average health in the Welsh valleys because it
  allowed greater intermarrying and so increased the effective gene
  pool size. Clearly, global travel could end up reducing the
  diversity.) Viruses propagating through schools and traveling
  business people, and problems propagating to someone who has a
  solution, are more good exercises. (State your assumptions!)</p>
  <h3>Zipf happens</h3>
  <p>Whether we like it or not, early measurements of web traffic
  by the DEC WRL firewall showed DEC employees browsing sites with
  a Zipf (1/n) distribution of popularity. (Anyone got any other
  measurements? [Neilsen 1997]). Recent analyses suggest the Web
  becoming smaller for its size seem to use.</p>
  <p>How can we use knowledge of the Web's fractal nature? By
  planning network bandwidth between long-range and short-range
  communication, planning for cache usage, etc. The physical
  network can be expected to have a variety of scale
  geographically, like the road system. However, the structure of
  the Web is interestingly different because of the lack of
  two-dimensional constraint. The challenge is to use this
  flexibility in building an effective society on top of the
  Web.</p>
  <h3>Looking for a metric</h3>
  <p>What do we mean by "effective"? We mean we would like to
  combine scientists' creative ability and knowledge to find a cure
  for AIDS. We would like to preserve world peace by allowing
  xenophobia to disperse in a web of understanding, while at the
  same time preserving the diversity of culture which gives the
  human race its richness. These are of course the same classic
  problems of the management of a large organization, of combining
  individual creativity with corporate vision.</p>
  <p>If the web of society has an imbalance, we pay for it. We pay
  for insufficient global understanding with war. We pay for
  insufficient family communication with broken families and
  unsupported individuals. At any level of scale, missing social
  structure at that scale will prevent problems at that scale being
  addressed, and also prevent resources at that scale being used.
  It would therefore be great to have a way of measuring for a
  given web the degree to which it has a balanced fractal pattern,
  and if not where its weaknesses are.</p>
  <p>Those looking for the "small world" effect chose metrics such
  as the maximum or mean value of the shortest path between any two
  points. This gives us a metric for effectiveness at the global
  scale, but not of the chewiness.</p>
  <p>Clustering algorithms can produce globs of various sizes, and
  a measure of the chewiness of a web may be that the cluster sizes
  have a Zipf distribution. For example, using Jon Kleinberg's
  algorithm (which for a link matrix A associates concepts with the
  eigenvectors of A*A), the strength of the cluster is the value of
  the eigenvalue, and (while this does not directly indicate size)
  an interesting test would be on the relative absolute values
  (squares?) of successive eigenvalues.</p>
  <p>Looking at it from the point of view of an individual (a graph
  node), an interesting question is the proportion of the traffic
  which is to local or more distant nodes. In Marchiori's model
  [<a href="#marchiori">Marchiori</a>] traffic flows between two
  nodes in inverse proportion to the resistance of the shortest
  path. The total "efficiency" is deemed to be the total flow
  between all pairs of nodes. Can we measure a "chewiness" which
  measures the approximation of the system to a fractal
  distribution of long and short range communication? If the
  Marchiori model were modified to use parallel conductance (more
  like a real signal flow system) then would this be simpler?</p>
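  <p>As a rough sketch of the shortest-path version of this measure,
  assume an undirected graph in which every link has resistance 1, so
  that the flow between two nodes is 1 divided by the number of hops
  between them. The function names and the two tiny example graphs are
  made up for illustration, and this is not claimed to be Marchiori's
  own formulation.</p>

```python
from collections import deque
from itertools import combinations


def shortest_hops(graph, start):
    """Breadth-first search: hop count from start to every reachable node."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


def total_efficiency(graph):
    """Sum of 1/distance over all unordered pairs of connected nodes."""
    hops_from = {node: shortest_hops(graph, node) for node in graph}
    total = 0.0
    for a, b in combinations(sorted(graph), 2):
        distance = hops_from[a].get(b)
        if distance:  # unreachable pairs contribute no flow
            total += 1.0 / distance
    return total


chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
print(total_efficiency(chain), total_efficiency(triangle))
```

  <p>Adding the one long-range link that closes the chain into a
  triangle raises the total, but a single number like this still says
  nothing about how the flow is spread across scales.</p>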
  <p>Suppose for example we look at the amount of connection we
  have with nodes whose distance, or groups whose size, is of each
  order of magnitude and look for smoothness up to the global
  level.</p>
  <h3>Stop Press</h3>
  <p><em>2000/03</em></p>
  <p>Well, here I was thinking that while it is intuitively clear
  that society has to be fractal, I didn't know a mathematical
  justification for it, when <a href="http://www.cs.cornell.edu/home/kleinber/kleinber.html">Jon
  Kleinberg</a> comes up with what for me is his second cool web
  result.</p>
  <p>This paper takes the case of a two-dimensional grid. It
  imagines each cell having a certain distribution of links of
  various lengths. It demonstrates that in order to achieve the
  connectivity a la <em>6 degrees of separation</em> which scales
  with the log of the size of the system, then the distribution of
  link density as a function of distance must be precisely an
  inverse-square law. That is, each cell must have the same number
  of links (on average) to cells 1-10 squares away as to cells
  10-100 away, etc. Anything more local or more global leads to
  less of a small-world phenomenon: this is the only scalable
  solution.</p>
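  <p>The "same number of links per decade" property can be checked
  with a little arithmetic. The sketch below assumes an unbounded
  square grid measured in Manhattan distance, where 4d cells lie at
  distance d, and a link probability proportional to distance raised to
  a negative exponent. It illustrates the balance only; it is not
  Kleinberg's proof, and the function name is invented here.</p>

```python
def links_in_band(lo, hi, exponent):
    """Relative number of links to cells at grid distance lo .. hi-1.

    Assumes 4*d cells at Manhattan distance d, each linked with
    probability proportional to d ** -exponent.
    """
    return sum(4 * d * d ** -exponent for d in range(lo, hi))


for exponent in (1, 2, 3):
    # Three successive decades of distance: 10-99, 100-999, 1000-9999.
    bands = [links_in_band(10 ** k, 10 ** (k + 1), exponent) for k in range(1, 4)]
    print(exponent, [round(band, 2) for band in bands])
```

  <p>With exponent 2 the three bands come out nearly equal. With
  exponent 1 each decade holds about ten times as many links as the one
  before (too global), and with exponent 3 about a tenth as many (too
  local).</p>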
  <p>True, this applies to a geographical grid, and a square rather
  uniform one at that. However, he does generalize it to more
  dimensions. Furthermore, you can see logically how the system
  works. To get a postcard to an arbitrary person in Massachusetts
  through a network of friends, you must have enough local friends
  to be able to find someone who will know someone in
  Massachusetts. The person they find in Massachusetts must be able
  to pass it to people successively closer and closer to the
  target. This only works if there is connectivity on each scale.
  True, no one has derived the metric of the number of hops a
  message takes as being an essential metric for systems, but on
  the other hand there is a clear analogy with the number of hops
  between a problem and a solution in a large organization.</p>
  <p>Other work:</p>
  <ul>
    <li>
      <a href="http://dmag.upf.es/livingsw">Living semantic web</a>
    </li>
  </ul>
  <h3>Data from Swoogle April 2005</h3><img style="width: 500px; height: 400px; float: right;" alt="Yes, zipf dist from Swoogle" src="diagrams/swoogle/figure6-2005-04.png"><br>
  Nice to see some Zipf-shaped curves.  Swoogle <a href="http://swoogle.umbc.edu/modules.php?name=Swoogle_Statistics&amp;file=figure&amp;figurename=figure6">
  notes</a>:
  <ul>
    <li>All these series follows Zipf's distribution, except the
    tail</li>
    <li>The sharp decrease the tail in "class populated" shows that
    the most populated classes highly correlated such that their
    are populated by almost the same amount of SWDs. Similar
    situation can be observed in other series.</li>
    <li>The closeness of the sharp decrease of "class populated"
    and "property populated" is caused by the co-existence of
    certain classes and certain properties.</li>
  </ul>
  <h2 id="personal">Postscript - A personal exercise</h2>
  <p>There will I am sure be a lot of ways in which the fractal
  requirement is used in web design. You can also use it in that
  task of figuring out how you fit in to society at large (and at
  small). Do your personal interactions spread across the scales?
  Here is a self-help chart to help think about this. You fill in
  the groups in your life.</p>
  <table border="1">
    <tbody>
      <tr>
        <th>Scale</th>
        <td>1</td>
        <td>10</td>
        <td>1000</td>
        <td>10k</td>
        <td>100k</td>
        <td>1M</td>
        <td>10M</td>
        <td>100M</td>
        <td>1G</td>
      </tr>
      <tr>
        <th>Group</th>
        <td>You</td>
        <td>
          family,
          <p>group</p>
        </td>
        <td>...</td>
        <td>...</td>
        <td>town?</td>
        <td>city?</td>
        <td>country?</td>
        <td>USA</td>
        <td>World population</td>
      </tr>
      <tr>
        <th>Time spent</th>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
      </tr>
      <tr>
        <th>Money spent</th>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
      </tr>
      <tr>
        <th>etc</th>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
        <td>?</td>
      </tr>
    </tbody>
  </table>
  <p>Another way to do this is find 11 jars, and label one with
  each scale in powers of 10. (You don't have to paint them but it
  helps).</p>
  <p><img src="diagrams/jars.png" alt="11 jars from 1 to 1G"></p>
  <p>Put marbles in each jar for each time period you spend on
  matters at a given scale, such as an international meeting, or a
  school sportsfield, or with your family, or alone in a treehouse.
  How well balanced do the jars become?</p>
  <p>As a social person, do you spend enough time with groups of
  each size? If not, are there people one click from you who do,
  and through whom you are indirectly present in those groups? One
  of the concerns is that the last column - the global column -
  tends in my observation to get the smallest amount of money at
  least, as in the US federal and state and town taxes are spread
  around the other areas but the level of international aid is very
  much lower. The cool thing is that I think people are born with
  DNA which gives them a healthy interest at all these levels.
  People who stick at one scale all their lives feel very
  uncomfortable. Maybe our preferences have evolved to form
  naturally a fractal society.</p>
  <h3><a name="tco" id="tco">Total Cost of Ontologies
  (2005)</a></h3>(I can't remember where I originally brought this
  up, I think at the Web Science workshop in London 2005/9. This is
  from ISWC 2005 slides.)
  <p>One of the interesting things about assuming a fractal
  distribution is you can think about the number of ontologies and
  the time it takes to make them, and the total cost of using
  ontologies. So let us for example naively assume that<br>
  ontologies are evenly spread across orders of magnitude; committee
  &nbsp;size goes&nbsp; as log(community),&nbsp;time as committee^2,
  cost is shared across community.<br></p>
  <table style="text-align: left; width: 100%;" border="1" cellpadding="2" cellspacing="2">
    <tbody>
      <tr>
        <td>Scale</td>
        <td>Eg</td>
        <td>Committee size</td>
        <td>Cost per ontology (weeks)</td>
        <td>Cost for me</td>
      </tr>
      <tr>
        <td>1</td>
        <td>Me</td>
        <td>1</td>
        <td>1</td>
        <td>1.000000</td>
      </tr>
      <tr>
        <td>10</td>
        <td>My team</td>
        <td>4</td>
        <td>16</td>
        <td>1.600000</td>
      </tr>
      <tr>
        <td>100</td>
        <td>Group</td>
        <td>7</td>
        <td>49</td>
        <td>0.490000</td>
      </tr>
      <tr>
        <td>1000</td>
        <td></td>
        <td>10</td>
        <td>100</td>
        <td>0.100000</td>
      </tr>
      <tr>
        <td>10k</td>
        <td>Enterprise</td>
        <td>13</td>
        <td>169</td>
        <td>0.016900</td>
      </tr>
      <tr>
        <td>100k</td>
        <td>Business area</td>
        <td>16</td>
        <td>256</td>
        <td>0.002560</td>
      </tr>
      <tr>
        <td>1M</td>
        <td></td>
        <td>19</td>
        <td>361</td>
        <td>0.000361</td>
      </tr>
      <tr>
        <td>10M</td>
        <td></td>
        <td>22</td>
        <td>484</td>
        <td>0.000048</td>
      </tr>
      <tr>
        <td>100M</td>
        <td>National, State</td>
        <td>25</td>
        <td>625</td>
        <td>0.000006</td>
      </tr>
      <tr>
        <td>1G</td>
        <td>EU, US</td>
        <td>28</td>
        <td>784</td>
        <td>0.000001</td>
      </tr>
      <tr>
        <td>10G</td>
        <td>Planet</td>
        <td>31</td>
        <td>961</td>
        <td>0.000000</td>
      </tr>
    </tbody>
  </table><br>
  Total cost of 10 ontologies: 3.2 weeks. Serious project: 30
  ontologies, TCO = 10 weeks.<br>
  Lesson: <span style="font-weight: bold;">Do your bit. Others will
  do theirs.</span><br>
  Thank those who do working groups.
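  <p>The table can be reproduced with a few lines of Python. The text
  only says that committee size goes as log(community); the constants
  used here (1 + 3 per power of ten) are an assumption read off the
  table's rows, and the function name is made up.</p>

```python
def ontology_costs(max_power=10):
    """Yield (community size, committee size, cost in weeks, my share)."""
    for power in range(max_power + 1):
        community = 10 ** power
        # Assumed constants: 1 + 3*log10(community) matches the table rows.
        committee = 1 + 3 * power
        weeks = committee ** 2  # time goes as committee size squared
        # The cost is shared evenly across the whole community.
        yield community, committee, weeks, weeks / community


rows = list(ontology_costs())
for community, committee, weeks, share in rows:
    print(f"{community:14d} {committee:3d} {weeks:4d} {share:.6f}")
print(f"Total cost for me: {sum(row[3] for row in rows):.1f} weeks")
```

  <p>Almost all of the total comes from the smallest scales: the
  ontologies you make yourself or with your team cost you far more than
  your share of the planet-wide ones.</p>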
  <h3><a name="exp" id="exp">Q: How can the semantic web
  work...</a></h3>
  <p><em>... when we are all in one big domain of discourse but
  people are all making their own local ontologies?</em>
  (2007/3/3)</p>
  <p>Rather than 'domain of discourse', or set of things
  considered, I think of 'community', set of agents communicating
  using certain terms. When one thinks in terms of domain of
  discourse, one tends to conclude that everyone who talks at all
  about a car (say) has cars in their domain of discourse and so
  everyone must share the model which includes the single class
  Car.</p>
  <p>It isn't like that though. An agent plays a role in many
  different overlapping communities. When I tag a photo as being of
  my car, or I agree to use my car in a car pool, or when I
  register the car with the Registry of Motor Vehicles, I probably
  use different ontologies. There is some finite effort it would
  take to integrate the ontologies, to establish some OWL (or
  rules, etc) to link them.</p>
  <ul>
    <li>Everyone is encouraged to reuse other people's classes and
    properties to the greatest extent they can.</li>
    <li>Some ontologies will already exist and be publicly shared by
    many, such as ical:dtstart, geo:longitude, etc. This is the
    single global community.</li>
    <li>Some ontologies will be established by smaller communities
    of many sizes.</li>
  </ul>
  <p>Why do I think the structure should be, and will be, fractal?
  Clearly there will be many more small communities, local
  ontologies, than global ones. Why a 1/f distribution? Well, it
  seems to occur in many systems including the web, and may be
  optimal for some problems. That we should design for a fractal
  distribution of ontologies is a hunch. But it does solve the
  issue you raise. Some aspects of the web have been shown to be
  fractal already.</p>Here are some properties of the
  interconnections:
  <ul>
    <li>The connections between the ontologies may be made after
    their creation, not necessarily involving the original ontology
    designers.</li>
    <li>There is a cost of connecting ontologies, figuring out
    how they connect, which people will pay when and only when they
    need the benefit of extra interoperability.</li>
    <li>Sometimes when connecting ontologies, it is so awkward
    there is pressure to change the terms that one community uses
    to fit in better with the other community. Again, a finite cost
    to make the change, against a benefit of more interop.</li>
  </ul>
  <p>Yes, if web-based means an overlapping set of many ontologies
  in a fractal distribution. In this fractal tangle, there will be
  several recurring patterns at different scales. One pattern is a
  local integration within (say) an enterprise, which starts
  point-point (problems scale as n^2) and then shifts with EIA to a
  hub-and-spoke as you say, where the effort scales as N. Then the
  hub is converted to use RDF, and that means the hub then plugs
  into an external bus, as it connects to shared ontologies.</p>
  <p>So the idea is that in any one message, some of the terms will
  be from a global ontology, some from subdomains. The amount of
  data which can be reused by another agent will depend on how many
  communities they have in common, how many ontologies they
  share.</p>
  <p>In other words, one global ontology is not a solution to the
  problem, and a local subdomain is not a solution either. But if
  each agent uses a mix of a few ontologies of different scale,
  that forms a global solution to the problem.</p>
  <h2>Conjecture</h2>
  <p>The conjecture is that there is some model which describes
  these systems reasonably well, and that given that model one can
  show that the scale-free distribution of communities is
  optimal.</p>
  <p>There are many other questions. Of course existing systems on
  the earth may be very much influenced by the geographical reality
  of a two-dimensional surface. Historical groups have been nested
  geographically. So though there may be aspects in which community
  size is scale-free, that may be a completely different
  optimisation problem from the one we have when on the Internet
  anyone can connect to anyone. If you could devise an algorithm
  for connecting people into groups, and so that they each
  participated in communities of different sizes in a scale-free
  way, then how much more effective (at solving problems, etc) can
  you make a web-based society which ignores geographical borders?
  To what extent does humanity as currently connected by the web in
  fact deviate from geographical nesting anyway?</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 08 Apr 1997 00:00:00 GMT</pubDate>
  <title>Fragment identifiers</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Fragment.html</link>
    <guid>https://www.w3.org/DesignIssues/Fragment.html</guid>
      <description><![CDATA[
  <h1>URI References: Fragment Identifiers on URIs</h1>
  <p>The URI by itself is a powerful thing, but there is a more
  powerful concept which is the URI reference.</p>
  <p>The URI reference is a thing you build by taking a URI for an
  information object, adding a "#" sign and then a
  <strong>Fragment identifier</strong>. (The last term is
  historical, so try not to think of it as necessarily identifying a
  fragment).</p>
  <p>The fragment identifier is a&nbsp;string after the URI, after the
  hash, which identifies something specific as a function of the
  document. For a user interface Web document such as an HTML page,
  it typically identifies a part or view. For example in the
  object</p>
  <pre>         http://foo/bar#frag
 
</pre>
  <p>the string "frag" is the fragment identifier. It is badly
  named, as it can identify anything.</p>
  <p>(Depending on where you look, the URI is considered to include
  the fragment identifier, or to have the fragment identifier
  appended to it. &nbsp;This is important for the BNF, but in
  practice you will find people using the terms URI and URL loosely
  for things which do or do not include a possible fragment
  identifier. Formally, the URI <strong>does</strong> include the
  fragment ID.)</p>
  <p>In practice, you can divide the processing which occurs when
  following a link using&nbsp;HTTP into three steps:</p>
  <ol>
    <li>The client figures out which server to contact by parsing
    part of the URL, and sends the URL as a request to the
    server;</li>
    <li>The server figures out which object is referred to by
    parsing the rest of the URL, and returns some rendition of it
    to the client;</li>
    <li>The client presents all or part of the object to the
    user.</li>
  </ol>
  <p>The last part typically involves finding some software class
  which can handle the given MIME type, and passing it the data
  stream. &nbsp;At the same time, the fragment identifier is passed
  as a parameter to the created object.</p>
  <p>For HTML, the fragment ID is an SGML ID of an element within
  the HTML object. For XML, if it is just a word, then it is the
  XML ID of an element in the document.</p>
  <h4><a name="significance" id="significance">Axiom</a></h4>
  <p class="axiom">The significance of the fragment identifier is a
  function of the MIME type of the object</p>
  <p>This means that the fragment id is opaque for the rest of the
  client code. &nbsp;The HTTP engine cannot make any assumptions
  about it. &nbsp;The server is not even given it.</p>
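  <p>The three steps and the axiom can be sketched in Python. The
  standard library's <code>urllib.parse.urldefrag</code> does the
  splitting; everything else here (the handler functions, the
  <code>HANDLERS</code> table and <code>fake_fetch</code>, which stands
  in for a real HTTP request) is hypothetical and exists only to show
  where the fragment goes.</p>

```python
from urllib.parse import urldefrag, urlsplit


def html_handler(body, fragment):
    return f"show HTML, scrolled to element with id {fragment!r}"


def text_handler(body, fragment):
    return f"show plain text, fragment {fragment!r} left to the text viewer"


# Hypothetical registry: each MIME type brings its own fragment semantics.
HANDLERS = {"text/html": html_handler, "text/plain": text_handler}


def follow_link(reference, fetch):
    # Step 1: the client strips the fragment and finds the server to contact.
    url, fragment = urldefrag(reference)
    host = urlsplit(url).netloc
    # Step 2: the server sees only the fragment-less URL.
    mime_type, body = fetch(host, url)
    # Step 3: the fragment is handed, uninterpreted, to the MIME type's handler.
    return HANDLERS[mime_type](body, fragment)


def fake_fetch(host, url):
    # Stand-in for an HTTP request; checks that no "#" reaches the server.
    assert "#" not in url
    return "text/html", "..."


print(follow_link("http://foo/bar#frag", fake_fetch))
```

  <p>Nothing in <code>follow_link</code> inspects the fragment. Only
  the handler chosen by MIME type gives it a meaning.</p>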
  <p>It also means that for any new data type one can be creative
  about using the fragment ID in a relevant way. For example, for
  a&nbsp;3D object the fragment ID &nbsp;could give a viewport. For
  a music object, the Fragment ID could give a &nbsp;section in
  time, or a set of parts, or it could include a suggested tempo.
  &nbsp;For future versions of HTML, the fragment ID could be made
  more powerful to include a range or "ladder" reference to a part
  or parts of the SGML element tree by position. A very useful
  fragment ID for plain text would allow ranges to be quoted by
  line and character number.</p>
  <p>These things are all decisions made when the MIME type is
  defined. &nbsp;Therefore,</p>
  <p class="axiom">The fragment ID spec for a new MIME
  type should be part of the MIME type registration
  process.</p>
  <p>Different MIME types then can have different fragment ID
  specifications. When HTTP for example negotiates between
  different content types, it is clearly useful for those types to
  have a consistent (hopefully identical) fragment ID syntax and
  semantics.</p>
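  <p>One way to picture the axiom is a table of handlers keyed by
  MIME type. The following is only a sketch under stated
  assumptions: the <code>line=N</code> syntax for plain text is a
  hypothetical simplification invented for this example, the HTML
  check is a naive regular expression rather than a real parser, and
  all function names are made up.</p>

```python
import re


def html_fragment(body, fragment):
    """Naive check: does some element carry id="fragment"? (not a real HTML parser)"""
    return re.search(r'\bid="%s"' % re.escape(fragment), body) is not None


def text_fragment(body, fragment):
    """Hypothetical syntax 'line=N': return the N-th line (1-based), or None."""
    key, _, value = fragment.partition("=")
    if key != "line" or not value.isdigit():
        return None
    lines = body.splitlines()
    index = int(value) - 1
    return lines[index] if index in range(len(lines)) else None


# Each MIME type brings its own fragment ID semantics.
FRAGMENT_HANDLERS = {
    "text/html": html_fragment,
    "text/plain": text_fragment,
}


def resolve_fragment(mime_type, body, fragment):
    """Interpret the fragment according to the MIME type; None if no spec is known."""
    handler = FRAGMENT_HANDLERS.get(mime_type)
    return handler(body, fragment) if handler else None
```

  <p>The rest of the client only ever calls
  <code>resolve_fragment</code>; it never looks inside the fragment
  itself.</p>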
  <h3 id="Fragment1"><a name="Fragment2" id="Fragment2">Fragment
  identifiers for RDF identify concepts</a></h3>
  <p>The semantic web has information about anything. The fragment
  identifier on an RDF (or N3) document identifies not a part of
  the document, but whatever thing, abstract or concrete, animate
  or inanimate, the document describes as having that
  identifier.</p>
  <p>It is important, on the Semantic Web, to be clear about what
  is identified. An <code>http:</code> URI (without fragment
  identifier) necessarily identifies a <a href="Generic.html">generic document</a>. This is because the HTTP
  server response about a URI can deliver a rendition of (or
  location of, or apologies for) a document which is identified by
  the URI requested. A client which understands the http: protocol
  can immediately conclude that the fragment-id-less URI identifies a
  generic document. This is true even if the publisher (owner of
  the DNS name) has decided not to run a server. Even if it just
  records the fact that the document is not available online, still
  a client knows it refers to a document. This means that
  identifiers for arbitrary RDF concepts should have fragment
  identifiers. This, in turn, means that RDF namespaces should end
  with "#".</p>
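  <p>A small sketch of the consequence, assuming a hypothetical
  namespace <code>http://example.org/vocab#</code> and Python's
  standard <code>urllib.parse</code>: terms are formed by appending
  a local name to the namespace, and stripping the fragment gives
  the document which describes them.</p>

```python
from urllib.parse import urldefrag

# Hypothetical namespace; note that it ends with "#".
NAMESPACE = "http://example.org/vocab#"


def term(local_name):
    """Build the identifier of a concept in the namespace."""
    return NAMESPACE + local_name


def describing_document(concept_uri):
    """Return the fragment-less URI, which identifies the (generic) document."""
    return urldefrag(concept_uri).url


car = term("Car")
print(car)                       # identifies the concept
print(describing_document(car))  # identifies the document describing it
```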
  <h3 id="Object">Object Names as fragment identifiers</h3>
  <p>When a document language (MIME type) has some form of
  intra-document naming for objects, then it is intuitive if these
  names can be directly used as fragment identifiers. This is true
  for XML: the XML ID which is used to identify elements can
  be directly used as a fragment identifier.</p>
  <h3><a name="Fragment" id="Fragment">Fragment IDs and Content
  negotiation - known bug</a></h3>
  <p>If content negotiation occurs across types which do NOT share
  a fragment ID specification, then strictly speaking there has been an
  error. In practice, HTML was the only type (in 1997) which
  allowed fragment IDs anyway, and other types ignored them. Also,
  falling back from a pointer to a specific view to a pointer to
  the whole document has been considered an effective fallback
  procedure, so no harm was done. Now (2001) it becomes more of a
  problem. (There have been proposals to add the requested fragment
  identifier to the HTTP request to fix this.)</p>
  <p>In the future, metadata returned or warnings returned should
  indicate to the client that this could be a problem. Also, in new
  access protocols, the fragment ID requested could be shipped to
  the server as a hint, which would allow the server and client to
  negotiate and if successful arrange for the fragment ID to be
  converted to a suitable equivalent value for an alternative MIME
  type.</p>
  <h3><a name="User" id="User">User awareness of the form of a
  reference</a></h3>
  <p>Clearly when a fragment ID is generated and associated with a
  URI which is generic in any way (language, version, etc. as well
  as content-type), then there is a possible failure if the
  fragment ID refers to something which is not defined in any
  specific instance. It would be appropriate for a client,
  when generating a link (or bookmark, etc.) to provide the user
  with a choice of</p>
  <ul>
    <li>A bookmark to the whole living document, or</li>
    <li>A bookmark to a specific part of a "dead" version;</li>
    <li>Intermediate combinations.<br></li>
  </ul>
  <p>As both these options are meaningful and useful, they will
  have to surface at the user interface level.</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 01 Mar 1996 00:00:00 GMT</pubDate>
  <title>Generic resources</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Generic.html</link>
    <guid>https://www.w3.org/DesignIssues/Generic.html</guid>
      <description><![CDATA[
  See also:
  <ul>
    <li>
      <a href="../MarkUp/Resource/Specification">A proposal for an
      HTML "Resource" element</a>
    </li>
    <li>
      <a href="Formats.html">Historical web design note on
      formats</a>
    </li>
    <li>
      <a href="../Protocols/">HTTP overview by W3C</a>
    </li>
  </ul>
  <h1>Generic Resources</h1>
  <p>A URI represents a <b>resource</b></p>
  <p>A "resource" is a conceptual entity (a little like a Platonic
  ideal). When represented electronically, a resource may be of the
  kind which corresponds to only one possible bit stream
  representation. An example is the text version of an Internet
  RFC. That never changes. It will always have the same checksum.</p>
  <p>On the other hand, a resource may be <b>generic</b> in that as
  a concept it is well specified but not so specifically specified
  that it can only be represented by a single bit stream. In this
  case, other URIs may exist which identify a resource more
  specifically. These other URIs identify resources too, and there
  is a relationship of genericity between the generic and the
  relatively specific resource.</p>
  <p>As an example, successively specific resources might be</p>
  <ol>
    <li>The Bible</li>
    <li>The Bible, King James Version</li>
    <li>The Bible, KJV, in English</li>
    <li>A particular ASCII rendering of the KJV Bible in
    English</li>
  </ol>
  <p>Each resource may have a URI. The authority which allocates
  the URI is the authority which determines to what it refers.
  Therefore, that authority determines to what extent that resource
  is generic or specific.</p>
  <p>This model is more of an observation of a requirement than an
  implementation decision. Multilevel genericity clearly exists in
  all our current life with books and electronic documents.
  Adoption of this model simply follows from the rule that Web
  design should not arbitrarily seek to constrain life in general
  for its own purposes.</p>
  <h2>Dimensions of genericity</h2>
  <p>When we discuss electronic resources, an interesting fact is
  that a small number of dimensions of genericity emerge.</p>
  <table border="1" cellpadding="2">
    <tbody>
      <tr>
        <td>Time</td>
        <td>A resource may vary with time. For example, "The Wall
        Street Journal" varies with time. Each issue is a
        time-specific resource, which does not change with time.
        Most home pages on the Web change with time, in a less
        periodic way.</td>
      </tr>
      <tr>
        <td>Language</td>
        <td>When a document is translated, it is useful to be able
        to refer to it either in the generic, or to a particular
        specific translation.</td>
      </tr>
      <tr>
        <td>Content-Type</td>
        <td>A given resource may have many ways in which it can be
        represented on the wire, using different
        <tt>Content-type</tt>s (in HTTP terms). As an example, an
        image may be represented in PNG or JFIF format.</td>
      </tr>
      <tr>
        <td>Target medium</td>
        <td>A given resource may be targeted specifically to a
        specific medium, such as a printer, being displayed on a
        laptop screen, being displayed on a cellphone, or being
        projected onto a large screen for an audience. (This is
        currently available for selecting CSS stylesheets, but is
        not done at the HTTP content negotiation level.)</td>
      </tr>
    </tbody>
  </table>
  <p>The fact that there are such a small number of dimensions
  currently apparent suggests that Web software should handle them
  individually in its interface with the user, even though the
  architecture should handle them as a single concept.</p>
  <h2>Derivation</h2>
  <p>When a document is translated, one of the language-specific
  resources may have been the original source. However, this need
  not always be the case. Specific resources may have been derived
  from unrelated sources, or multiple sources. Therefore, though it
  is interesting to be able to describe the "derived-from"
  relationship, this is <em>not</em> part of the genericity
  relationship. It is not discussed further here.</p>
  <h2>Genericity Metadata</h2>
  <p>When making statements about resources, genericity leads to two
  types of statement. The examples use imaginary HTML elements or
  HTTP headers as illustrations of the meaning.</p>
  <h3><a name="Dimensions" id="Dimensions">Dimensions</a></h3>
  <p>A statement about the genericity of an object is important
  both for the user and, for example, for a cache manager. This
  statement takes the form of a list of dimensions in which the
  resource for a given URI is generic.</p>
  <p>One proposal was the <tt>vary</tt> field in the <tt>URI:</tt>
  header in HTTP:</p>
  <p><code>URI: http://foo.com/bar/baz vary=time,language</code>
  This is a statement about the relationship between the URI and
  the resource. (See also <a href="NameMyth.html#QoS">Quality of
  service of names</a>)</p>
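  <p>To illustrate how a cache manager might read such a statement,
  here is a sketch which parses a header value of the form proposed
  above. The exact syntax is an assumption based only on the single
  example given, and the function name is hypothetical.</p>

```python
def parse_uri_header(value):
    """Split 'URI vary=a,b' into (uri, set of generic dimensions).

    Assumes the URI comes first, followed by an optional
    space-separated 'vary=' list, as in the example above.
    """
    uri, _, rest = value.partition(" ")
    dimensions = set()
    if rest.startswith("vary="):
        names = rest[len("vary="):].split(",")
        dimensions = {name.strip() for name in names if name.strip()}
    return uri, dimensions


uri, dimensions = parse_uri_header("http://foo.com/bar/baz vary=time,language")
# A cache could reuse a stored copy across languages only if
# "language" is not among the dimensions in which the URI is generic.
print(uri, sorted(dimensions))
```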
  <h3>Relationships</h3>
  <p>The other statement which can be made is about a genericity
  relationship between two resources. Typed links provide this kind
  of statement. One proposal was</p>
  <pre> 
         &lt;link rel="language-specific" href="baz.fr"&gt;
 
 
</pre>
  <p>which means "This resource is a language-specific version of
  the resource identified by baz.fr". This needs to be combined
  with information about the particular language.</p>
  <pre> 
         &lt;resource uri="baz.fr" vary="type, time"&gt;
                 &lt;meta http-equiv="content-language" value="Fr"&gt;
         &lt;/resource&gt;
 
</pre>
  <p>So much for the architectural ideas. In practice one would use
  a shorthand form for all this information such as</p>
  <pre>         &lt;specific language="fr" uri="baz.fr"&gt;
 or
         &lt;specific language="fr" ct="text/html" uri="baz.fr.html"&gt;

</pre>
  <h2 id="Using">Using RDF to model this</h2>
  <p>There is now an RDF ontology for these concepts, <a href="http://www.w3.org/2006/gen/ont">http://www.w3.org/2006/gen/ont</a>.
  The ontology does not describe the target-medium dimension.
  (Please use that instead of the old one described here in
  2000-09.)</p>
<h2 id="UsingOld">Old ontology RDF to model this</h2>
<address>
  Added 2000/09 
</address>


<p>Now that the RDF metadata architecture is developed, we can model genericity
using a set of properties to represent these relationships. The natural way to
do this is to define classes for the one-parameter flags such as
time-invariant, language-invariant, etc. and properties such as
isLanguageSpecificVersionOf.</p>

<table border="1">
  <caption>Classes</caption>
  <tbody>
    <tr>
      <th>Class name</th>
      <th>Significance</th>
    </tr>
    <tr>
      <td>u:TimeInvariant</td>
      <td>The relationship between a representation of this resource and the
        URI will not change over time</td>
    </tr>
    <tr>
      <td>u:LanguageInvariant</td>
      <td>The relationship between a representation of this resource and the
        URI will not change no matter what language is requested.</td>
    </tr>
    <tr>
      <td>u:ContentTypeInvariant</td>
      <td>The relationship between a representation of this resource and the
        URI will not change as a function of content negotiation of MIME
      type</td>
    </tr>
    <tr>
      <td>u:Fixed</td>
      <td>The relationship between a representation of this resource and the
        URI will not change under any circumstances</td>
    </tr>
  </tbody>
</table>

<p>u:Fixed is a subclass of each of the other three. P3P policies are supposed
to be in u:Fixed.</p>
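<p>The subclass statement can be sketched in a few lines. This is an
illustration only, using plain Python sets rather than an RDF library; the
class names come from the table above, while the function name
<code>classes_of</code> is made up.</p>

```python
# u:Fixed is a subclass of each of the other three classes.
SUBCLASS_OF = {
    "u:Fixed": {
        "u:TimeInvariant",
        "u:LanguageInvariant",
        "u:ContentTypeInvariant",
    },
}


def classes_of(declared):
    """Return the declared classes plus every class implied by SUBCLASS_OF."""
    result = set()
    pending = list(declared)
    while pending:
        cls = pending.pop()
        if cls not in result:
            result.add(cls)
            pending.extend(SUBCLASS_OF.get(cls, ()))
    return result


# A resource declared u:Fixed is therefore also time-, language-
# and content-type-invariant.
print(sorted(classes_of({"u:Fixed"})))
```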

<table border="1">
  <caption>Properties</caption>
  <tbody>
    <tr>
      <th>Property name</th>
      <th>Significance</th>
      <th>Domain</th>
      <th>Inverse property name</th>
    </tr>
    <tr>
      <td>u:isVersionOf</td>
      <td>A is one of the specific versions of a time-generic resource B</td>
      <td>u:TimeInvariant</td>
      <td>u:hasVersion</td>
    </tr>
    <tr>
      <td>u:isLanguageSpecificVersionOf</td>
      <td>A is one of the specific languages (in the sense of HTTP
        content-language) of a language-generic resource B</td>
      <td>u:LanguageInvariant</td>
      <td>u:hasLanguageSpecificVersion</td>
    </tr>
    <tr>
      <td>u:isContentTypeSpecificOf</td>
      <td>A is one of the specific content-type-specific resources (in the sense of HTTP
        Content-type) of a generic resource B</td>
      <td>u:ContentTypeInvariant</td>
      <td>u:hasContentTypeSpecificResource</td>
    </tr>
  </tbody>
</table>


<p>There is no assurance when one of these properties is used that either
subject or object is not itself invariant.  In other words, if one states of
two identical TimeInvariant resources that one is a version of the other, that
is consistent. The promise that neither will change can be made later, and is
consistent with an earlier promise that one will not change.</p>

  ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 27 Jul 2024 00:00:00 GMT</pubDate>
  <title>The Good Things on the Internet</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Good.html</link>
    <guid>https://www.w3.org/DesignIssues/Good.html</guid>
      <description><![CDATA[
    <p>There is a growing movement to 
      understand, fix, and mitigate the problems of the web, and
      specifically of social media.

      Parents of children and youth worry about
      the potential harm to their offspring from engaging in democracy,
      and schools wonder whether to just ban phones for kids.
      There are a lot of important and good things
      on the web - which in fact
      come from the vast majority of the web sites and apps.
      We need to recognize that, make sure we and our children
      make the best use of it, while protecting ourselves from the harms.
    When you look at all of the things to do on the web, or
    in the apps, then the majority are actually not damaging, many 
   are in fact good - and many are actually wonderful.  
   There are the pre-web systems like email, podcast and blog readers, and chat.
   There are web platforms which are beneficent, including open source systems.
   There are systems built on top of the Solid Protocol, which naturally
   provide users with a power that we call digital sovereignty.

    </p>
  
<p><a href="https://www.w3.org/DesignIssues/Good.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 01 Jun 2009 00:00:00 GMT</pubDate>
  <title>Putting Government Data on the Web</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/GovData.html</link>
    <guid>https://www.w3.org/DesignIssues/GovData.html</guid>
      <description><![CDATA[Government data is being put online to
  increase accountability, contribute valuable information about
  the world, and to enable government, the country, and the world
  to function more efficiently. All of these purposes are served by
  putting the information on the Web as Linked Data. Start with the
  "low-hanging fruit". Whatever else, the raw data should be made
  available as soon as possible. Preferably, it should be put up as
  Linked Data. As a third priority, it should be linked to other
  sources. As a lower priority, nice user interfaces should be made
  to it -- if interested communities outside government have not
  already done it. The Linked Data technology, unlike any other
  technology, allows any data communication to be composed of many
  mixed vocabularies. Each vocabulary is from a community, be it
  international, national, state or local; or specific to an
  industry sector. This optimizes the usual trade-off between the
  expense and difficulty of getting wide agreement, and the
  practicality of working in a smaller community. Effort toward
  interoperability can be spent where most needed, making the
  evolution with time smoother and more productive.
<p><a href="https://www.w3.org/DesignIssues/GovData.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 13 Mar 2022 00:00:00 GMT</pubDate>
  <title>The Intimacy Gradient</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Gradient.html</link>
    <guid>https://www.w3.org/DesignIssues/Gradient.html</guid>
      <description><![CDATA[
    <p>A city has public places where I can do all kinds of things,
    and also a private house with a private room where I may be by
    myself. In that house there are spaces where I do things with
    family, friends, colleagues. The web must, like a well-designed
    building, provide a gradient of intimacy between the private
    and the public, so I can easily recognize the difference,
    easily know which I am in, and easily welcome people to come
    into the more intimate areas. Our Solid tools should respect
    these ideas.</p>
  
<p><a href="https://www.w3.org/DesignIssues/Gradient.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 19 May 2008 00:00:00 GMT</pubDate>
  <title>HTML and XML</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/HTML-XML.html</link>
    <guid>https://www.w3.org/DesignIssues/HTML-XML.html</guid>
      <description><![CDATA[
  <div class="cols">
    <h1>HTML and XML</h1>
    <address>
      W3C AC meeting, 2008-05-19
    </address>
    <p>The goal of this document is to investigate the possibility,
    over time, of healing the rift between the HTML5 and XML
    technologies, to achieve interoperability between software and
    markup which are currently on two sides of the fork.</p>
    <p>The method is to try to understand the motivations of the
    various positions, and address those at source, and not to use
    them to decide that a particular fork is "right".</p>
    <p>The content of this essay is accumulated from many sources.
    It was given in large part as a talk to the May 2008 W3C
    Advisory Committee meeting, posing a series of questions about
    future directions for HTML. Discussion of this topic is
    directed to the W3C TAG list, www-tag@w3.org (<a href="http://lists.w3.org/Archives/Public/www-tag/">archive</a>)
    .</p>
    <h2>Introduction</h2>
    <p>The development of Web technology advances at different
    speeds on different fronts and different times. Occasionally it
    seems that some strategic thinking is necessary in order to
    ensure that the system as a whole will continue to work well
    and evolve smoothly. This is one of those times.</p>
    <h3>The fork</h3>
    <p>The purpose of this essay is not to detail the history, but
    let me start by summarizing quickly to set the context. HTML is
    the most widely deployed document format by a long way in the
    history of computing. XML, also, is very successful, being a
    framework for many formats, public and private, in many different
    applications. As a simplification of the original SGML, on
    which HTML was based, XML allows code to be lighter and faster
    than SGML systems, and makes it easier for developers. We have
    seen in recent years a hiatus in the development of HTML,
    followed by a more recent surge along two branches. One branch
    of HTML, XHTML, which switched from using SGML to using XML,
    provided various new features, used the XML namespaces
    extensibility system, but was not widely deployed in the
    dominant Web browser, Internet Explorer. Another branch, HTML5,
    has been specified with the explicit goal of describing exactly
    the rather contorted behavior existing browsers implement to
    handle the legacy of Web pages found in practice on the Web, as
    well as introducing a different set of new features (video
    tags, etc). While it provides for an optional XML
    serialization, HTML5 does not in general use XML and
    specifically does not use XML namespaces. Below we unravel the
    separate criticisms of XML and XML namespaces.</p>
    <p>The existence of the fork is a serious problem, both because
    a fork in standards is fundamentally costly for the whole
    community going forward, and because of the technological
    problems which are highlighted in the issues which each branch
    has with the other branch.</p>
    <p><strong>Arguments for cleaning up</strong></p>
    <p>Now, there may be extreme versions of the HTML5-fork style
    which maintain that everything is fine, and that the mess is
    just life; we will have to live with liberal parsers forever,
    and that is the only realistic approach. However, not only is
    the code stack horrible to maintain, but pages that are not
    well formed are hard to maintain, process, and reuse.</p>
    <p>Also, there is a whole world of XML-based software in the
    enterprise, some of it SOAP-based services, some of it more
    document-oriented, whose developers could not imagine for a
    moment deviating from the XML path by allowing this sort of
    liberalness, as systems would just stop.</p>
    <p>Can we assume that the HTML Web and the XML enterprise
    systems will be non-interoperable worlds? Possibly, but with a
    constant cost, whenever attempts are made to move data from one
    to another, to embed some HTML product description into an
    order, for example. The boundary will never be clear as in
    fact there is an overlap. Some suggest making a version of SVG
    which is in HTML5 (liberal) format, while others use XML
    engines to process SVG. People embed HTML in RSS and Atom feeds
    and RDF feeds (using RDF's XMLLiteral datatype) and RDF parsers
    don't have embedded HTML5 parsers, so it has to be well-formed.
    And so on.</p>
    <p>To continue to promote messy code on the Web is to create
    problems and pain later on. To promote clean XML is a current
    pain for real users which they will not put up with. How can we
    escape from this? To understand possible paths forward let us
    look at how the language is typically extended, on each
    fork.</p>
    <h2>Centralized HTML extensibility</h2>
    <p>The HTML community has not embraced URI-based extensibility.
    In fact, decentralized extensibility is not a general goal to
    many. This is not surprising. HTML is the most widely deployed
    data format in history by a long way. Every Web browser is
    expected to be able to handle it. Its evolution is a form of
    ongoing negotiation between users, Web developers, and browser
    developers (and their management). The HTML language itself has
    a unique place among other languages. The model of a large
    number of small overlapping communities, which was the target
    of the RDF design, does not apply to the HTML language.</p>
    <p>It is not surprising that requiring each HTML document to
    start with a namespace declaration irks those for whom the
    whole world is HTML. When everyone is deemed to know the HTML
    spec, why have it vectored to by the XHTML namespace and the
    namespaces specification?</p>
    <p>Decentralized extensibility allows new modules to be added
    to a language by third parties, but why bother when the modules
    which are generally proposed for addition to HTML, such as SVG,
    MathML and XForms, can be counted on the fingers of one hand? In
    this case the HTML design authority can simply add new modules
    themselves. If extensions are needed, then they can just be
    added to the specification. The list of modules can be made
    available to everyone, as all systems are expected to be
    programmed with an inbuilt knowledge of the HTML spec.</p>
    <p>While browser plug-ins can be dynamically downloaded from
    the net, in general HTML extensibility from one level to the
    next has not been done in that way at all. Using the
    foundational rule that browsers have from the beginning ignored
    tags they did not understand, new HTML tags have been added in
    a calculated way so as to hopefully maximize the benefit and
    minimize the damage to the community as a whole. Historically,
    the HTML working group did not make any commitment that the
    meaning of tags would not change over time, only that change
    would be made as responsibly as possible.</p>
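    <p>The "ignore tags you do not understand" rule can be sketched
    with Python's standard <code>html.parser</code> module. This is
    only an illustration: the set of known tags is a tiny
    hypothetical subset, and the class and function names are made
    up.</p>

```python
from html.parser import HTMLParser

# Hypothetical, tiny subset of tags this "browser" understands.
KNOWN_TAGS = {"p", "em", "a"}


class TolerantRenderer(HTMLParser):
    """Collect text content, noting but otherwise ignoring unknown tags."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.ignored = []

    def handle_starttag(self, tag, attrs):
        if tag not in KNOWN_TAGS:
            # Unknown tag: skip the markup, keep processing its content.
            self.ignored.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def render(markup):
    """Return (text content, list of unknown tags that were ignored)."""
    renderer = TolerantRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.text), renderer.ignored


print(render("<p>Hello <blink>new</blink> world</p>"))
```

    <p>An older client still shows the text inside a tag it has
    never seen, which is what makes it possible to add tags
    centrally without breaking deployed software.</p>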
    <h2>Decentralized extensibility</h2>
    <p>Let us investigate the philosophy, now, of the XML
    branch.</p>
    <p>In a world in which there are very many XML-based
    technologies, and many many groups needing to create new ones
    and extend old ones, a major motivating requirement has been
    decentralized extensibility. This is the requirement for a
    group to be able to define the terms involved in the new
    technology without having to get an audience with and agreement
    with a central committee. (Examples of centralized extensibility
    include the Dewey decimal system, the Library of
    Congress cataloging system, and the international phone number
    space.)</p>
    <h3>URI-based extensibility</h3>
    <p>In the Web environment, decentralized extensibility can be
    done using HTTP URIs. Basically this means that any group which
    can lay claim to some (normally HTTP) URI space can pick a URI
    for a new feature, without having to go through any centralized
    clearing house (other than the domain name system). It also
    means that the namespace URI can be used to give pointers to
    developers, or, with very persistent caching, machines, willing
    to learn about the new features.</p>
    <p>Historically, namespaces were actually a requirement on XML
    Namespaces imposed by RDF, which was developed in parallel with
    XML. RDF is aimed at a multitude of communities all
    independently agreeing on different though connected sets of
    terms, and then being able to merge their datasets which use
    these terms. URI-based extensibility has been very successful
    in the world of RDF itself, as many ontologies have been
    developed without central coordination by the RDF working
    group, which indeed closed long ago. One might argue that
    arbitrary non-RDF XML applications cannot use URI-based
    extensibility in the same way, as they do not have the very
    powerful "ignore triples you don't understand" model of RDF,
    but a counterexample would be the use of independent
    namespace-qualified tag names in SOAP messages, headers and
    content. Another example would be the <a href="http://www.exslt.org/">EXSLT</a> group which uses
    namespaces to extend XSLT.</p>
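    <p>As a sketch of why URI-based names do not collide, the
    following uses Python's standard
    <code>xml.etree.ElementTree</code> on a made-up order document.
    The <code>http://example.com/orders</code> namespace and the
    function name are hypothetical; the XHTML namespace URI is the
    real one.</p>

```python
import xml.etree.ElementTree as ET

# Two groups each define an element called "title" without consulting
# each other; the namespace URI keeps the two names distinct.
DOCUMENT = """\
<order xmlns="http://example.com/orders"
       xmlns:x="http://www.w3.org/1999/xhtml">
  <title>Order 17</title>
  <x:title>Product description</x:title>
</order>"""


def qualified_child_names(xml_text):
    """Return the namespace-qualified names of the root's children."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


for name in qualified_child_names(DOCUMENT):
    print(name)
```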
    <h3>Follow-your-nose principle</h3>
    <p>The use of HTTP URIs for extensibility is not just a
    question of allocating names unambiguously. The fact that HTTP
    URIs have ownership means that there is a responsible authority
    who can be traced and called upon to explain what a term is
    supposed to be for and how it relates to other terms.</p>
    <p>In fact, as we use HTTP URIs, one can in real time look up
    that information. Although, for the sake of the servers, the
    looking up of a namespace document should be viewed as an
    installation process with a permanent cache, a machine can
    usefully pick up information at run-time which will allow a
    system to usefully process a vocabulary which it has not before
    encountered. This again is much more developed in the RDF
    world, where ontologies can contain enough information for a
    new user interface to be created on the fly.</p>
    <p>The follow-your-nose principle, then, allows a form of
    bootstrapping. Like any bootstrap, though, it needs a base to
    start from. In this case, there is a core set of specifications
    which a client has to understand in order to do the
    bootstrapping. Examples of these core specs are Ethernet, TCP
    and IP, DNS, HTTP, and the Internet Content Type (also known as
    <em>MIME type</em>) registry.</p>
    <p>By one model, a content type of <code>text/html</code> in an
    HTTP response indicates an HTML document. A content type of
    <code>application/xhtml+xml</code> indicates an XHTML
    document.</p>
    <p>By another model, a content type of
    <code>application/xml</code> indicates an XML document, and if,
    within such a document, namespaces are used for the document
    element, then the XHTML namespace URI
    (<code>http://www.w3.org/1999/xhtml</code>) within it indicates
    an XHTML document.</p>
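    <p>The two models can be put side by side in a short sketch,
    assuming Python's standard
    <code>xml.etree.ElementTree</code>. The function name
    <code>looks_like_xhtml</code> is hypothetical, and real clients
    do considerably more than this.</p>

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def looks_like_xhtml(content_type, body):
    """Decide whether a response is XHTML under either model."""
    media_type = content_type.split(";")[0].strip().lower()
    # First model: the content type alone settles the question.
    if media_type == "application/xhtml+xml":
        return True
    # Second model: generic XML, so follow the namespace of the
    # document element.
    if media_type == "application/xml":
        root = ET.fromstring(body)
        return root.tag.startswith("{" + XHTML_NS + "}")
    return False


page = '<html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>'
print(looks_like_xhtml("application/xml", page))
```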
    <h2>Recent controversies</h2>
    <p>Recently, controversies have arisen as various groups have
    attempted to create new feature sets suitable for adding to
    HTML and similar languages. One of these is <a href="https://www.w3.org/WAI/standards-guidelines/aria/">ARIA</a>,
    which allows a Web page to be annotated to explain the user
    interface function of various elements, and another is <a href="https://www.w3.org/TR/rdfa-core/">RDFa</a>, which allows a Web
    page to be annotated to explain the meaning of various elements
    and add more data. Each of these technologies, like many other
    technologies one can imagine, works by adding new attributes
    (and sometimes elements) to the markup.</p>
    <p>In <strong>ARIA</strong>, about 30 new attributes are added.
    In the XML fork, in one design, these were added using an aria
    namespace, as, say, aria:foo, while in the HTML5 fork, they
    were added as aria-foo in the HTML namespace. The arguments
    about these choices were fairly long and complex, and involved
    for example discussions of what exactly legacy browsers would
    do with the DOM in each case. The users of the spec are not
    just document writers, but also those who write scripts to
    access and interpret the attributes. In any event, there was no
    way one could write the same thing in both languages, at both the
    markup and the script level, it seemed.</p>
    <p>In <strong>RDFa</strong> (derived from "RDF in attributes"),
    the requirement was to add new attributes to allow semantics to
    be given for embedded HTML data. The GRDDL specification, an
    existing recommendation for pointing to a transform script
    which extracts RDF from a document, is a possible point of
    leverage in the follow-your-nose story, if one takes GRDDL as
    being, for semantic Web clients, part of the bootstrap
    core functionality.</p>
    <p>In the XML fork, extensibility is achieved using namespaces,
    but in the non-XML fork, there are a number of less obvious
    options, which include the addition of all new attributes to
    the HTML world, as though they were in the HTML5 spec. In
    this case, the social question is: can a group just announce
    that it is adding attributes to the HTML namespace<a href="#L15593">*</a>, or does it have to get it put there or at
    least agreed with the HTML design authority? In the normal
    world of standards, the latter is the rule, as each
    specification needs, it is felt, a coordinating body. In the
    HTML world, though, introduction of new tags by vendors, and
    new attribute values (such as rel="nofollow") is often done
    without such coordination; the 'marketplace' decides which tags
    live and which don't, and the low probability of collision
    replaces the use of clearing houses for new names and values.
    [fn1]</p>
    <p>In practice, then, ARIA and RDFa have proposed to add new
    attributes (and/or elements) to HTML, deeming them to be added
    by dint of the existence of new specifications, seeing whether
    they get adopted by a community of readers and writers once
    specified, and seeing whether they appeal to those involved
    in the mainstream HTML language evolution to be worth either
    inclusion or reference.</p>
    <p>So, can we just use a different model for HTML, because of
    its special place among languages?</p>
    <p>While these are two recent examples, one soon discovers many
    examples of the development or integration of new technologies
    in this area:</p>
    <p>The SVG community has made a very modular specification
    intended to be mixed with other markup languages, originally
    using namespaces.</p>
    <p>The Mobile HTML specs have used XHTML very cleanly, and
    XHTML has been integrated with SVG in some cases, following the
    XML fork. SOAP systems enclose all kinds of XML in their
    payload, and can include XHTML within that where textual data
    is present in a remote service invocation or response.</p>
    <p>Meanwhile, by contrast, suggestions have surfaced that SVG
    should be integrated into HTML5 simply by pouring the SVG tags
    into the HTML specification, using no explicit extensibility
    controls at all.</p>
    <p>So it is impossible to draw a line around HTML as a special
    case isolating it from the mass of different communities
    developing their individual applications. So what can be
    done?</p>
    <h2>Scale free space</h2>
    <p>The Web is, as I have <a href="Fractal.html">mentioned</a>
    before, composed of many different communities of different
    sizes, and is often seen to have scale-free properties. That
    is, for example, that there is no 'typical' number of inbound
    links to a page, but the distribution follows a power law. This
    is partly a measured phenomenon of the Web; it is also a
    phenomenon which occurs in many other systems, and also I have
    an unproved hunch that it represents a form of optimal
    arrangement for society to function effectively. It may be the
    optimum tradeoff between the ungainliness but great
    interoperability of a central language and the agility of small
    communities using a Babel tower of different languages.</p>
    <p>It is a characteristic feature of such scale-free systems
    that they have one leading player, closely followed in the
    popularity ranks by other players in decreasing popularity.</p>
    <p>In the case of vocabularies on the Web, we have HTML at
    the largest scale, in which tags are just tags and everyone is
    supposed to know them. One could argue that SVG actually
    belongs at this level and should be and will be as widely
    deployed as HTML.</p>
    <p>At the next level we have languages which are not HTML but
    still address the needs of very large communities. SVG, MathML
    and ARIA are examples.</p>
    <p>There are many medium-sized communities. The FaceBook Markup
    Language (FBML) is an example of a vocabulary proposed by one
    website, though a significant site. Atom feeds for various
    things can be considered at this level. Also, enterprise
    systems include many many XML namespaces which are developed,
    for example, in SOAP-based applications.</p>
    <p>Continuing on (roughly) down the scale we get to
    vocabularies for protein scientists and history museums, for
    scout troops and bird fanciers; we get vocabularies invented
    for today's experiment in a lab, for the import of a particular
    spreadsheet and so on.</p>
    <p>It is reasonable for us to not just sit back and admire the
    scale-free nature of the space, but to actively engineer for
    it. What does this mean in this case? It means that we should
    engineer the system with an understanding that HTML is a
    dominant language (at the moment) used by a very large
    community of individuals, but with an understanding that there
    are many other communities, many other languages and
    specifications, and that these often have to be able to connect
    with the HTML architecture.</p>
    <p>I would like to investigate the possibility of us
    deliberately designing ourselves a system which is optimal, in
    that it addresses the needs of all parties, and brings the two
    branches of the fork into the same space, so that there is
    a continuum of extensibility. We start by looking at the issues
    and problems that arise when attempting to use XML and
    namespaces as the basis for the HTML5 fork.</p>
    <h2>XML Issues</h2>
    <p>So what are some of the issues with XML which drive the
    HTML5 fork away from becoming closer to the XML fork?</p>
    <table border="1" style="margin-left:1.5in">
      <tbody>
        <tr>
          <th>Issue</th>
          <th>Motivation</th>
        </tr>
        <tr>
          <td>It is a pain to have to add quotes around
          attributes</td>
          <td>Ease of use</td>
        </tr>
        <tr>
          <td>It is a pain to have to spell the entire tag in the
          end tag</td>
          <td>Ease of use</td>
        </tr>
        <tr>
          <td>Parsers must stop on error</td>
          <td>unfriendly, impractical</td>
        </tr>
        <tr>
          <td>
            Namespace URIs take <a href="http://www.tbray.org/ongoing/When/200x/2003/12/13/MegaXML">
            too much space</a>
          </td>
          <td>impractical</td>
        </tr>
        <tr>
          <td>Non-nested begin/end tags have to be
          accommodated</td>
          <td>Legacy TAG soup</td>
        </tr>
      </tbody>
    </table>
    <p>At the top are the ones which one could imagine being cured
    by a redesign of XML. To the bottom are the things which I
    would resist changing in HTML. In the middle are areas where
    one could imagine some compromise.</p>
    <p>One fundamental difference of philosophy between the forks
    has been the attitude to deviations from the specification. In
    the past, people making Web pages have made many deviations
    from the specifications, so long as they worked. The result is
    a legacy of Web pages which have all sorts of errors. It has
    been essential in the market for browsers that they work with
    these pages. The approach taken in HTML5 has been to document
    the behavior of these browsers, so that everyone knows what it
    is. The goal is that all old pages still work, but there can
    now be a well-defined algorithm and a test suite, instead of a
    heap of connected kludges implemented separately at great cost
    by each browser maker. This world is, then, very liberal, in
    what the Web page writers are allowed to do, and in what the
    client software has to accept.</p>
    <p>The initial approach taken in the XHTML fork was very
    different: it was completely conservative. Recognizing that the
    situation which had arisen with legacy HTML was a big mess,
    XHTML started anew. A new content type was allocated for XHTML.
    The XML specification required that any processor deliver no
    results if the input was not well-formed XML. The idea was for
    XHTML to start a new branch of clean content which would
    eventually outgrow the old, and which would be a platform for
    much cleaner growth, with namespace-based extensibility, and
    addition of SVG, MathML, XForms, and enterprise-specific
    extensions in a well-defined way. Organizations and individuals
    who have adopted XHTML are often vocal in their praises for the
    benefits which they experience, but this has evidently not led
    to any substantial inroads into the dominance of HTML in the
    general public web.</p>
    <h2>Robustness Principle</h2>
    <p>The Internet specifications, since <a href="http://www.ietf.org/rfc/rfc0793.txt">RFC793</a>, have been
    developed with guidance from a <a href="http://en.wikipedia.org/wiki/Robustness_Principle">principle</a>
    that one should be conservative in what one generates, but
    liberal in what one accepts. This is often a useful maxim, when
    writing a program to send or receive messages, and when there
    is a possible area of the spec open to interpretation. So one
    would send lines of limited length, but accept lines of any
    length; send always the same case as in the examples, but
    accept either upper or lower case, and so on.</p>
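    <p>As a minimal sketch of the maxim in Python: the function
    names and the 72-character limit below are illustrative
    assumptions, not taken from any specification. The sender emits
    one canonical form; the receiver tolerates variation in case,
    spacing and length.</p>
    <pre>MAX_SENT_LINE = 72   # assumed limit on what we generate

def generate_header(name, value):
    """Conservative: canonical case, single spaces, bounded length."""
    line = name.strip().title() + ": " + " ".join(value.split())
    if line[MAX_SENT_LINE:]:             # anything past the limit
        raise ValueError("refusing to send an over-long line")
    return line

def accept_header(line):
    """Liberal: any case, any spacing around the colon, any length."""
    name, _, value = line.partition(":")
    return name.strip().lower(), value.strip()</pre>
    <p>The asymmetry is the point: the two functions are not
    inverses, and the receiver never rejects what the sender would
    refuse to produce.</p>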
    <p>This maxim works when two programs are communicating with
    short-lived messages, and when there is feedback between
    engineers when a system doesn't work. It has not worked so well
    on the Web, because the Web page designers in fact paid no heed
    to being conservative. They were not in general engineers who
    had read the spec at all, but random people copying each
    other's Web pages, and seeing what worked when they modified
    them. Further, the Web pages have a long, hopefully very long,
    lifetime. Once a Web page is out there with badly nested tags,
    it is out there for good. So on the web, there are some page
    creators who are no longer present, and others who are around
    and are open to new languages and constructive
    feedback. Should the robustness principle be used or, if not,
    what?</p>
    <h3>Incentives</h3>
    <p>To look at a system which includes people, one must study
    the incentives for those people. Suppose there is, on the one
    hand (and on the X axis) a certain effort which a Web page
    author puts into the writing of a Web page, to eliminate
    various levels of error, and on the other hand (and on the Y
    axis) a reward given, in part, in terms of the quality of the
    rendered Web page on the range of clients perceived to be of
    interest.</p>
    <p><object data="diagrams/motivation/conservative.svg" width="80%" type="image/svg+xml">
      <img alt=" big step" src="diagrams/motivation/conservative.svg" width="80%">
    </object></p>
    <p>In the case, shown above, of the conservative, XML fork
    browser, the page must be completely correct or nothing is
    rendered. The writer who has an almost perfect page is
    motivated to fix it, but the writer who has a page with several
    errors is not, as there will be no noticeable reward for
    incremental improvement. It is not very surprising that the
    majority of Web users whose pages would have started off near
    the left of the graph did not make it to the right when serving
    their code as XHTML.</p>
    <p><object data="diagrams/motivation/liberal.svg" width="80%">
      <img alt="another big step" src="diagrams/motivation/liberal.svg" width="80%">
    </object></p>
    <p>Some errors we may consider hopeless even in HTML, in that
    no useful recovery seems possible for them. In the case of the
    liberal browser (above), the reward for a hopeless page is
    zero, but for a page with any other level of errors, it in fact
    is rendered completely by the browsers. Therefore, a writer
    whose page is hopeless is motivated to clean it up a little
    bit. But the writers of pages which have other levels of error
    are not motivated to clean them up at all.</p>
    <p>So while the liberal and conservative forks have very
    different philosophies, they share one thing: They do not
    motivate the writer of a Web page to progressively improve
    their offering.</p>
    <h2>Bringing the fork together</h2>
    <p>The solution, as I see it, is to look at the motivating
    slope and fix it. When the user is provided with incremental
    rewards,</p>
    <p><object data="diagrams/motivation/slope.svg" width="80%">
      <img alt="A slope is better" src="diagrams/motivation/slope.svg" width="80%">
    </object></p>
    <p>then he or she will move, hopefully, up the slope.</p>
    <p><object data="diagrams/motivation/slope2.svg" width="80%">
      <img alt="A slope is better" src="diagrams/motivation/slope2.svg" width="80%">
    </object></p>
    <h2>Motivating slope</h2>
    <p>What does this mean?</p>
    <p>It means distinguishing more than the two possible outcomes
    of success and failure. We need to make a slope, so we need
    different levels.</p>
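    <p>The three reward curves can be compared in a small Python
    sketch. The numbers are invented for illustration: errors are
    simply counted, and HOPELESS is an assumed threshold at which no
    recovery is possible.</p>
    <pre>HOPELESS = 10   # assumed error count at which recovery is impossible

def conservative_reward(errors):
    """XML fork: all or nothing."""
    return 1.0 if errors == 0 else 0.0

def liberal_reward(errors):
    """HTML fork: anything short of hopeless renders fully."""
    return 1.0 if errors in range(HOPELESS) else 0.0

def sloped_reward(errors):
    """Proposed: each error fixed earns a little more."""
    return max(0.0, 1.0 - errors / HOPELESS)

def incentive(reward, errors):
    """Gain in reward from fixing one more error."""
    return reward(errors - 1) - reward(errors)</pre>
    <p>With these definitions, a writer whose page has five errors
    gains nothing from fixing one of them under either step function,
    but does gain something under the slope.</p>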
    <p>It means recognizing all the errors as errors, but also
    allotting them an importance level, so that users can
    concentrate on fixing the more important ones, or perhaps the
    ones which give the best improvement per effort ratio.</p>
    <p>There has been push-back against the idea of showing error
    indicators on Web pages, because no one wanted to be the
    browser to give the sub-optimal user experience. This can
    change in several ways. It can change because of user attitude
    changes. Al Gore points out repeatedly that we need to clean up
    the planet. People understand when we have to do some clean up.
    A browser which does not have these features would be seen as
    irresponsible in this context.</p>
    <p>So step one is to have a tool bar which slides down when a
    page has errors, giving a rating to the page out of 100, and
    allowing drill-down by interested users. It is true that most
    users are not interested and are not able to do something about
    a random site they visit. However, they might still be
    interested in the fact that the site is not clean. People who
    buy a business may be interested in knowing whether that
    business pollutes the planet. Similarly, people may be
    interested in knowing whether the HTML that they publish or the
    sites that they visit are polluting the Web.</p>
    <p>Another possibility is to allow users to specify which Web
    sites they are connected with. Anyone involved in the
    production of a Web site (up to the CEO and board for the
    company!) should be able to put that site into the list of
    sites for which they want more detailed feedback.</p>
    <h3>Changing the browser</h3>
    <p>We can also be smarter. We can make it so much easier for
    people to do the right thing.</p>
    <p>The classic way the Web spreads is by the "<em>View
    Source</em> effect". You like someone's Web page, you do a
    View Source operation in the browser, and then you copy it and
    paste it into your own Web page. This is the way Web technology
    has spread, and also the way of course all those problems have
    spread. Suppose, whenever I look at the source of a page, I see
    a cleaned up version? Suppose it is impossible (or very
    difficult) to actually see the original source without it being
    heavily marked with the places where it has syntactic errors.
    Suppose if I copy it in a clipboard, then I get the cleaned up
    version? Suppose this applies to "Save As" too? The code to
    clean up a Web page is not that big by today's standards. There
    are many implementations, Dave Raggett's <a href="http://tidy.sourceforge.net/">tidy</a> being a
    well-established one, also now as in Marc Gueury's <a href="http://users.skynet.be/mgueury/mozilla/">HTML Validator
    Firefox Extension</a>.</p>
    <p>(One way to do it of course is to simply re-serialize the
    DOM tree of the page as loaded. This of course loses the
    formatting, which in general is a disadvantage, particularly
    when one needs to compare versions of source files, or use
    source code control systems which do so.)</p>
    <p>It wouldn't have to be perfect. It would have to move a page
    substantially along the curve toward the clean end of the
    spectrum.</p>
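    <p>A deliberately imperfect cleaner of this re-serializing kind
    can be sketched with Python's standard html.parser module. The
    names Cleaner and clean are assumptions for this illustration,
    the list of void elements is partial, and the sketch drops
    comments and doctypes and treats script and style content as
    ordinary text.</p>
    <pre>from html import escape
from html.parser import HTMLParser

VOID = {"br", "hr", "img", "input", "link", "meta"}   # partial list

class Cleaner(HTMLParser):
    """Re-serialize tag soup with quoted attributes and balanced tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of cleaned output
        self.open = []       # elements opened but not yet closed

    def handle_starttag(self, tag, attrs):
        pairs = "".join(' %s="%s"' % (k, escape(v or "")) for k, v in attrs)
        if tag in VOID:
            self.out.append("&lt;" + tag + pairs + " /&gt;")
        else:
            self.out.append("&lt;" + tag + pairs + "&gt;")
            self.open.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open:
            return                       # stray end tag: drop it
        while self.open:                 # close inner elements first
            top = self.open.pop()
            self.out.append("&lt;/" + top + "&gt;")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

def clean(source):
    cleaner = Cleaner()
    cleaner.feed(source)
    cleaner.close()                      # flush buffered text
    while cleaner.open:                  # close whatever is still open
        cleaner.out.append("&lt;/" + cleaner.open.pop() + "&gt;")
    return "".join(cleaner.out)</pre>
    <p>Tag and attribute names come out in lower case, attribute
    values come out quoted, and badly nested or unclosed elements are
    closed in order. The original layout inside tags is lost, as
    noted above.</p>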
    <p>There are some things which browser manufacturers could do
    right now, which could in fact change the ecosystem of
    developers and pages so that in a year or two a significant
    number of new pages were being produced cleanly, and in a few
    more years as the new content starts to dominate, the majority
    of the pages you see on the Web would be clean.</p>
    <p>We are not talking here about a switch to
    application/xhtml+xml, but about continuing to use the MIME type
    text/html and progressively improving the content we produce
    so that it becomes cleaner.</p>
    <p>Would that be a good idea, and what exactly would it
    mean?</p>
    <p>Well, in fact if everything was XML some people might regard
    it as actually less useful than the current HTML, when it comes
    to quotes around attributes. So this forces us to look at
    whether XML could actually change itself, to meet HTML between
    their current positions. So I would suggest that some of the
    things we would have put on the slope, some of the cleanliness
    goals, we simply remove and declare them non-goals. But to do
    that, we have to change XML.</p>
    <h3>Changing validators</h3>
    <p>It turns out that the opinion of the W3C validator has a
    large amount of clout in the community. Specifications such as
    microformats and ARIA have been affected very much by what can
    be done without breaking validation. Now the validator to date
    has been a DTD-based validator, so it checks that the document
    conforms to a given grammar. It requires, to be happy with the
    page, a DTD declaration which specifies what grammar the author
    of the page thought the page was written with.</p>
    <p>DTD validation will not allow the normal forms of XML
    extension, the addition of new elements and attributes. This is
    very ironic in a way. The "X" in "XML" is for "Extensible". The
    whole point is that an application written in XML can be
    extended by adding new element and attribute types. With
    namespaces, these elements and attributes become grounded in
    the Web, and URI space provides a way of avoiding any
    collision.</p>
    <p>In this vision of a way forward, validators, or perhaps one
    should say <em>page checkers</em>, as <em>validate</em> is a
    word claimed by XML for DTD-validation, should give a grade to
    a page, judging it on several counts, at various levels. At the
    error level:</p>
    <ul>
      <li>Content-Type wrong</li>
      <li>Character encoding (if marked UTF-8 is it really
      UTF-8?)</li>
      <li>Well-formedness: Bad nesting, missing end tag</li>
      <li>HTML elements misplaced according to some kind of
      grammar</li>
    </ul>
    <p>At a warning level:</p>
    <ul>
      <li>Extension tags used with no namespace</li>
      <li>Extension tags used, in a namespace without a namespace
      document</li>
    </ul>
    <p>At an informative level:</p>
    <ul>
      <li>Extension tags used with a namespace and namespace
      document</li>
      <li>Extension tags defined in other W3C recommendations</li>
      <li>Quotes missing from attribute values which do not contain
      spaces</li>
    </ul>
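    <p>A page checker along these lines might be sketched as follows
    in Python. The finding names and the penalty weights are made up
    for this illustration; only the three levels, and which kind of
    finding falls at which level, come from the lists above.</p>
    <pre># Penalty per finding at each level (assumed weights)
PENALTY = {"error": 20, "warning": 5, "info": 0}

# Level assigned to each kind of finding, following the lists above
LEVEL = {
    "content-type-wrong": "error",
    "bad-encoding": "error",
    "not-well-formed": "error",
    "misplaced-element": "error",
    "extension-no-namespace": "warning",
    "extension-no-namespace-document": "warning",
    "extension-with-namespace-document": "info",
    "extension-from-w3c-recommendation": "info",
    "unquoted-simple-attribute": "info",
}

def grade(findings):
    """Return a score out of 100 and the findings, most serious first."""
    order = ["error", "warning", "info"]
    ranked = sorted(findings, key=lambda f: order.index(LEVEL[f]))
    score = max(0, 100 - sum(PENALTY[LEVEL[f]] for f in findings))
    return score, ranked</pre>
    <p>A page with one nesting error, one extension tag with no
    namespace and one unquoted attribute would score 75 under these
    weights, with the nesting error listed first.</p>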
    <p>In fact, it may be that the browser, now a computing
    platform of some power, is in fact the best development
    platform for a page check in the future. It is possible that
    the same code in fact could be deployed in a third party server
    checker harness as in a client-side checker.</p>
    <h2>Changing XML syntax</h2>
    <p>The arguments against changing XML are very strong. Its
    single great value is its single common specification, its
    stability. It isn't perfect but it is common across so many
    different applications. Attempts to create an XML1.1 failed,
    just trying to introduce a few new Unicode characters.</p>
    <p>The arguments for changing it are that the alternative could
    be worse: It could be that the HTML5-style syntax with errors
    of all kinds being completely ignored propagates into first SVG
    then RSS then RDF and then SOAP. The entire stack has to be
    built so as to be able to do HTML5 error ignoring, with special
    knowledge that comes with that of various HTML tags. Even if
    you aren't using HTML itself, you just have to use that
    parser.</p>
    <p>What would be changed? It would be recommended that parsers
    recover from errors where they can, and indicate all errors
    above a certain level of seriousness to the user.</p>
    <p>Now everyone I have spoken to about this has their own list
    of things they would like to change in XML, if we were to do
    it, so deciding what goes in would be an interesting communal
    decision. Here is a list of some things which have come up.
    Some are better ideas than others, in my humble opinion.</p>
    <ul>
      <li>Allow attribute quotes to be omitted for simple
      values.</li>
      <li>Allow namespace to be implicit, given Content Type</li>
      <li>Short-hand for switching from one namespace to another?
      (grounded in Namespace document)</li>
      <li>Short form of close tag <code>&lt;/&gt;</code> ?</li>
      <li>Remove DTDs.</li>
      <li>URIs for grammar, cross-schema links for mixed NS</li>
      <li>Remove PIs (Have a xml:pi element if you like?)</li>
      <li>Multiple root elements or mixed content as document?</li>
    </ul>
    <p>(See Tim Bray <a href="http://www.textuality.com/xml/xmlSW.html">2002</a>, Norm Walsh
    on XML2.0 in <a href="http://norman.walsh.name/2004/11/10/xml20">2004</a>, <a href="http://norman.walsh.name/2008/02/20/xml20">2008</a>, ...)</p>
    <p>Let's go through these in order to clarify what we are
    talking about.</p>
    <h3>Optional Quoting of Attributes</h3>
    <p>The quoting of attribute values I have already mentioned.
    The quotes in SGML were not necessary. When SGML was simplified
    to XML, the quotes were made mandatory. This simplified the
    parser, but it complicated life for writers, and required more
    keystrokes, disk space and bandwidth. It also made the source
    more difficult to read by increasing clutter.</p>
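    <p>To give a feel for the parser-side cost, here is a Python
    sketch of a preprocessor which restores the quotes before the
    text reaches an ordinary XML 1.0 parser. The name
    quote_attributes, and the reading of "simple value" as a run of
    characters with no whitespace, quotes or angle brackets, are
    assumptions of this sketch. It ignores comments and CDATA
    sections, and assumes no angle brackets inside attribute
    values.</p>
    <pre>import re

# A start tag (assumes no angle brackets inside attribute values)
TAG = re.compile(r"&lt;[A-Za-z][^&lt;&gt;]*&gt;")
# name=value, where value is quoted or a simple unquoted run
ATTR = re.compile(r"""([\w:.-]+)=("[^"]*"|'[^']*'|[^\s"'&lt;&gt;]+?(?=\s|/?&gt;))""")

def quote_attributes(text):
    """Add double quotes around unquoted simple attribute values."""
    def fix_attr(match):
        name, value = match.group(1), match.group(2)
        if value[0] in "\"'":
            return match.group(0)        # already quoted: leave alone
        return name + '="' + value + '"'
    return TAG.sub(lambda tag: ATTR.sub(fix_attr, tag.group(0)), text)</pre>
    <p>A value which ends in a slash directly before the closing
    bracket is ambiguous in this scheme; the sketch reads the slash
    as part of an empty-element tag.</p>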
    <h3>Implied namespace</h3>
    <p>The implied namespace idea comes from a consideration of the
    follow-your-nose argument above. If an HTML document is
    delivered with a Content-Type which labels it as HTML, then why
    on earth does this information have, in XHTML, to be repeated
    in the document as the root namespace element? It is a waste of
    space and an imposition on the user. However, whether the page
    has an explicit namespace or not, I would like to be able to
    parse it and look for elements in the DOM using the XHTML
    namespace. So I would like all HTML elements to be deemed to be
    in the XHTML namespace. This is actually I think a sensible
    change to the architecture, that:</p>
    <ol>
      <li>With XML-based content, the MIME type registry contains
      an implied namespace. For text/html, this is the XHTML
      namespace.</li>
      <li>The XML parser interface is extended to include an extra
      parameter, the implicit namespace</li>
    </ol>
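    <p>These two points can be sketched in Python with the standard
    library's ElementTree. The table IMPLIED_NAMESPACE and the
    function names are hypothetical stand-ins for the registry and
    the extended parser interface; no such registry exists today, and
    the SVG entry is only an assumed example of what it might
    hold.</p>
    <pre>import xml.etree.ElementTree as ET

# Stand-in for a MIME type registry which carries an implied namespace
IMPLIED_NAMESPACE = {
    "text/html": "http://www.w3.org/1999/xhtml",
    "image/svg+xml": "http://www.w3.org/2000/svg",
}

def parse_with_implicit_namespace(text, implicit_namespace):
    """Parse XML; elements left without a namespace get the implicit one."""
    root = ET.fromstring(text)
    for element in root.iter():
        if not element.tag.startswith("{"):      # no namespace assigned
            element.tag = "{" + implicit_namespace + "}" + element.tag
    return root

def parse_for_content_type(text, content_type):
    """Look the implicit namespace up from the content type, then parse."""
    return parse_with_implicit_namespace(text, IMPLIED_NAMESPACE[content_type])</pre>
    <p>A document with an explicit XHTML namespace declaration and
    one with none then produce the same element names, so code which
    searches the tree by namespace works on both.</p>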
    <p>(Note that while this is a default for the namespace, the
    term <em>default namespace</em> already means the namespace for
    elements with no prefix, so we can't use that term for this
    concept: the namespace assumed for unprefixed elements when
    there is no namespace declaration at all.)</p>
    <p>This would make SVG documents smaller as well, and who knows
    what else. It could be useful for cutting down the transmission
    time for small XHTML documents to mobile devices, and so
    on.</p>
    <p>What about mixed documents? Well, the HTML mime type could
    be registered so that the implicit default namespace is XHTML,
    but also there is an implicit s: namespace for SVG and m: for
    math, and so on. A machine-readable list could be made
    centrally available, changed occasionally, and downloaded at
    install time (not run time, to save the servers!) in XML-2
    parsers.</p>
    <h3>Switching namespaces</h3>
    <p>The fact that documents have a well defined meaning as
    grounded in the Web traces back to the terms being defined
    using URIs. This does not, however, mean that URIs have to be
    embedded in their full glory at every step. Namespaces already
    serve as an abbreviation system. Now we could add a sort of
    chaining within documents. For example, one could define an
    &lt;svg&gt; element in the HTML namespace which would have the
    implicit effect of switching namespaces to the SVG
    namespace.</p>
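    <p>As a rough Python sketch of such chaining: the table SWITCHES
    below is a hypothetical stand-in for what a schema would have to
    declare, mapping an element name in the HTML namespace to the
    namespace which its subtree switches to.</p>
    <pre>import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
# Hypothetical schema information: HTML elements which switch namespace
SWITCHES = {
    "svg": "http://www.w3.org/2000/svg",
    "math": "http://www.w3.org/1998/Math/MathML",
}

def apply_namespaces(element, current=XHTML):
    """Assign a namespace to every un-namespaced element, top down."""
    if not element.tag.startswith("{"):
        if current == XHTML:                     # only HTML elements switch
            current = SWITCHES.get(element.tag, current)
        element.tag = "{" + current + "}" + element.tag
    for child in element:
        apply_namespaces(child, current)         # children inherit
    return element</pre>
    <p>Note that the document cannot be interpreted correctly without
    the table, and that this sketch has no way to switch back to HTML
    inside an SVG subtree.</p>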
    <p>I am not sure that this is a good idea, as it makes the
    amount of off-line information needed more complex, as one
    would have to have a way of specifying this in a schema
    language of some sort, and it would be impossible to parse the
    document correctly without that schema document. But it could
    be a valuable avenue to explore if push-back against namespaces
    continues.</p>
    <h3>Remove DTDs</h3>
    <p>The DTD syntax within XML is a historical artifact. It was
    part of SGML, used for defining grammars of SGML applications,
    but not itself using SGML syntax. The DTD language was kept in
    XML as at the time there was nothing to replace it. Since then,
    DTDs have been joined by XML Schema, Relax-NG, and other
    languages for specifying constraints for applications of XML.
    Meanwhile, DTDs have fallen behind in that they do not
    naturally accommodate namespaces. A large amount of
    infrastructure has been constructed around them in the XHTML
    fork's <a href="https://www.w3.org/2014/10/modularity-slides/">HTML
    Modularization</a> spec.</p>
    <p>The main reason for keeping DTDs in XML systems has been
    that they are needed for defining entities, and specifically
    character entities. The solution to this I would suggest is to
    define a namespace of tools to do this in XML. One could even
    take part of the xml: namespace.</p>
    <p>This would also mean that one would lose the feature of
    default values for attributes and fixed values for attributes.
    These are a strange feature of the language in many ways. They
    make this unfortunate difference between a raw infoset and a
    post-validation infoset. They allow the DTD designer to say
    "even if you didn't put this attribute in, you still meant it".
    Of course the semantics of the application language can always
    be defined to have defaults, even when they are not provided by
    a DTD processing step.</p>
    <p>There have of course been many discussions of this topic over
    the years.</p>
    <h3>Processing Instructions</h3>
    <p>Processing Instructions (PIs) are a strange corner of the
    XML specification which could be removed to advantage. PIs
    provide a form of "machine-readable comments" which sit between
    code (normal markup and text) and comments (which should be
    completely ignored in the application semantics).</p>
    <p>Where one is tempted to use a PI, one should use a namespace
    to add an attribute for example to the root element. That
    allows one to have many levels of hint to different possible
    processors and interpreters about different things. After all,
    why have three levels when you can have n? (In fact, in RDF, I
    would often recommend that comments be left as rdfs:comment
    statements so that they are preserved in the processing and
    enlighten people reusing the data in completely different
    contexts.)</p>
    <p>PIs are a kludge. The question of what is inside them, and
    what it means,</p>
    <h3>Close tag abbreviation</h3>
    <p>A commonly suggested shortcut, while we are discussing
    shortcuts, is to allow the closing tag &lt;/foo&gt; to be given
    as &lt;/&gt;. I understand this was a question of debate in the
    original XML design, and did not get in at the time. It is less
    self-documenting and less robust in the face of certain errors,
    but it can save a lot of space for enterprise applications
    where tag names can become very long. Also, for
    machine-generated code where operator error is not a problem
    and indenting can be done automatically, it clearly cuts down
    on the size of the file.</p>
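    <p>The expansion back to full close tags is mechanical, as a
    short Python sketch can show. The function name
    expand_close_tags is an assumption, and the tag pattern is
    deliberately naive: it ignores comments, CDATA sections and
    angle brackets inside attribute values.</p>
    <pre>import re

# Any tag: optional leading slash, optional name, optional trailing slash
TOKEN = re.compile(r"&lt;(/?)([A-Za-z][\w:.-]*)?[^&lt;&gt;]*?(/?)&gt;")

def expand_close_tags(text):
    """Rewrite each short close tag as the full close tag it stands for."""
    open_tags = []                       # names of elements still open

    def rewrite(match):
        closing, name, empty = match.group(1), match.group(2), match.group(3)
        if closing:
            if not open_tags:
                raise ValueError("close tag with nothing open")
            opened = open_tags.pop()
            return match.group(0) if name else "&lt;/" + opened + "&gt;"
        if name and not empty:
            open_tags.append(name)       # start tag: remember it
        return match.group(0)            # everything else passes through

    return TOKEN.sub(rewrite, text)</pre>
    <p>The loss of robustness is visible here: a full close tag
    could be checked against the name on the stack, whereas the short
    form simply closes whatever happens to be open.</p>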
    <h3>Multiple root elements, or mixed content as XML
    document<a href="#L16102">*</a></h3>
    <p>A characteristic which XML does not currently have is the
    ability to concatenate two valid XML documents and get a new big
    XML document. This property would have its uses. To make it
    possible, one could allow mixed content (a mixture of elements
    and text) at the outermost level. Advantages include that</p>
    <ul>
      <li>It would be possible to transmit arbitrary XML content
      (for example from a selection, or the answer to a question on
      a form) as an XML document itself.</li>
      <li>One could concatenate XML documents and ship them as one
      when the information to be transferred was the union of the
      information in each;</li>
      <li>One could use XML as a format in which the default was plain
      text, but markup could be allowed if necessary. For example,
      the title of a book, often plain text but sometimes needing a
      character entity, or a form field or e-mail in which the
      default might be plain text but occasionally one wants to add
      HTML emphasis.</li>
    </ul>
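    <p>Today's XML 1.0 parsers can approximate this by wrapping the
    content in a synthetic root element before parsing. A Python
    sketch, in which the function names and the wrapper element name
    "fragment" are arbitrary assumptions, and in which the pieces
    must not carry their own XML declarations:</p>
    <pre>import xml.etree.ElementTree as ET

def parse_mixed(text):
    """Parse text which may hold several top-level elements and bare text."""
    return ET.fromstring("&lt;fragment&gt;" + text + "&lt;/fragment&gt;")

def concatenate(*documents):
    """Concatenating such documents is plain string concatenation."""
    return parse_mixed("".join(documents))</pre>
    <p>A book title which is mostly plain text, with one emphasized
    word in the middle, parses as text, an element, and trailing
    text, all at the top level.</p>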
    <h3>What source could look like</h3>
    <p>The markup for a page which currently in XML will typically
    start like</p>
    <pre>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html PUBLIC
 "-//W3C//DTD XHTML 1.0 Transitional//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
&lt;head&gt;
  &lt;link href="../../../People/Berners-Lee/general.css"
  rel="stylesheet"  type="text/css" /&gt;
&lt;/head&gt;
...</pre>
    <p>in the future when served as <code>text/html</code> could
    look like simply</p>
    <pre>&lt;html&gt;
&lt;head&gt;
 &lt;link href=../../../People/Berners-Lee/general.css
  rel=stylesheet type=text/css /&gt;
&lt;/head&gt;
...</pre>
    <p>and be considered perfectly valid XML2. It could be parsed
    by general XML2 applications, which would be passed the
    implicit namespace which would have come from a content-type
    lookup table. To HTML authors, the only non-HTML thing they
    would have to do is remember the /&gt; on the end of the link
    tag. So there is in the end a compromise between the forks, but
    one in which everyone can do most of what they want. So this
    may be a better place to be. Is it worth trying to get
    there?</p>
    <h3>Costs of change</h3>
    <p>Changes to XML syntax would of course be a very major step.
    It would break a level of stability in the XML specification
    which has been one of its major advantages. It would potentially
    affect a very large number of parsers.</p>
    <p>On the other hand, it would only affect the parsers. The XML
    data model is not changed by the surface changes to the syntax.
    XML1 files would be valid XML2, so serializers would not have to
    change. Languages such as XPath, XSLT and XQuery which are
    defined on the data model would not change. However, just
    changing XML parsers would be a very dramatic step for the
    industry. It would leave behind many programs whose development
    has stopped.</p>
    <p>But then again, if the alternative is that all systems have
    two parsers, one for HTML5-like data and one for XML, that is a
    huge cost too.</p>
    <p>What about the cost of change to browsers?</p>
    <p>Browsers currently have very many ways of treating web
    pages, to adapt to different forms of the language out there on
    the web. In one sense, a merged fork track would be another
    variation, one chosen to be more stable in the long term. The
    tricks to recognize particular types of old content will
    presumably be necessary into the future.</p>
    <p>Changes to the browsers to bring them toward a common DOM
    for HTML and XML are also going to be significant. To a certain
    extent, perhaps one can allow the namespaced API calls to
    follow the XML+Namespaces model, but the non-namespaced calls
    to follow the HTML model. The complications here are too great
    to go into here. <sup><a href="#L2726">dw</a></sup></p>
    <h2>Conclusion</h2>
    <p>Future developers will not only use the languages we define
    today, they will build on them to make new, more sophisticated
    ones. The cleaner the systems we develop, the easier
    it will be. The HTML and SVG document models, for example, are
    powerful user interface libraries, and exciting new
    applications are being built on top of them. The difficulties
    involved in dealing with the different APIs of different forks
    don't help.</p>
    <p>We, the Web technology community at large, have a duty to
    lead the technology toward cleaner engineering solutions. While
    we should retain an ability to read old web pages, we should
    move the community of producers (both hand-coders and authoring
    tools) so that the newly produced web pages become progressively
    cleaner.</p>
    <p>To do that, we have to understand the motivations of website
    developers and browser writers and server administrators. We
    have to understand how changes to the software and the
    specifications can tweak the way people behave. We can also set
    new community goals and a new community attitude about unclean
    Web pages, so long as at the same time we move the goal of
    cleanliness to make it less irksome.</p>
    <p>The direction outlined here involves quite a lot of work. It
    means developing new parsers, page checkers and browsers which
    encourage cleanliness. It means cleaning up authoring tools. It
    involves solving many intricate technical details of how these
    Web pages look to a script in the DOM. But the alternative --
    the current forked track -- will be a lot of work too. Keeping
    both forks maintained with separate diverging code stacks.
    Writing scripts which explicitly check whether they are in an HTML5
    or XHTML environment every few lines. Developing increasingly
    complex new extension methods for HTML5 to emulate namespaces.
    As the future unrolls, porting new developments, like the
    <em>&lt;video&gt;</em> tag, within HTML5 to XHTML, and new
    developments, like RDFa, within XHTML to HTML. Or putting up
    with the burden of continual re-invention of new functionality
    in quite incompatible ways, on both sides of the stack.</p>
    <p>We need to set ourselves goals of merging the forks, with
    some give on each side. We need to switch from strictly liberal
    and strictly conservative attitudes to one in which
    progressively cleaner pages are considered progressively
    better. We need to adopt an attitude that we are going to clean
    up the Web just as we sometimes need to clean a bedroom -- or a
    planet.</p>
    <h3 id="grouchy">Grouchy Robustness Principle</h3>
    <blockquote>
      Be conservative in what you produce. Be liberal about what you
      accept but complain about any deviations from the spec in a
      way to help and to motivate the producers to adhere to it
      better.
    </blockquote>
    <p>Tim Berners-Lee</p>
    <p>Original August 2008, made public May 2019</p>
    <hr>
    <h2>Footnotes etc</h2>
    <p>This is a deliverable of <a href="http://www.w3.org/2001/tag/group/track/actions/145">TAG issue
    145</a>.</p>
    <p>This is $Id: HTML-XML.html,v 1.5 2019/05/20 21:36:35 timbl
    Exp $</p>
    <p><small><a name="L15593" id="L15593">1.</a> (There is a
    parallel with adding them all to the XHTML namespace, but this
    is unfortunately not a precise one, because attributes without
    explicit namespaces are deemed in the NS spec to be in <em>no
    namespace</em>, rather than to be in the namespace of the
    element. The fact that the XHTML &lt;a&gt; element has an @href
    attribute, for example, does not mean that there is an
    attribute xhtml:href which one could consider mixing into other
    languages. Some, including me, regard this as a bug in the
    namespace specification.)</small></p>
    <p><small><a name="L15989" id="L15989">2.</a> (Footnote: At the
    schema level there are issues too. There is not space to go
    into those here, but to oversimplify, DTDs are broken by design
    as they actually don't use XML syntax; XML schema got
    complicated; RelaxNG is a competing standard, but still needs
    NVDL to enable mixed namespaces. There is no simple way to say
    the fundamental statements for connecting two languages such as
    "An SVG circle can go anywhere an HTML IMG can
    go".)</small></p>
    <p><small><a name="L16102" id="L16102">3</a>. (Thanks to Norm
    Walsh for the multiple root element suggestion)</small></p>
    <p><small><a name="L2726" id="L2726">DW:</a> Perhaps the
    biggest single obstacle is the document.write() method, which
    binds code and markup much too intimately to allow them to
    evolve separately. It is too close to self-modifying code.
    Ironically, it is often used to provide a compact declarative form
    (document.write("&lt;p&gt;&lt;a
    href="../"&gt;&lt;b&gt;here&lt;/b&gt;?&lt;/a&gt;&lt;/p&gt;")) as
    an alternative to a sequence of method calls to build up the same
    thing. This is easier to write, and easier to read. If it were
    compiled into a data object (see E4X) this would be clean
    coding. As it is, the intricacies of when document.write()
    inserts what into what stream end up defining huge amounts of
    how code is written, and allow one to do all kinds of
    non-obvious things.</small></p>
  </div>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 01 Sep 2002 00:00:00 GMT</pubDate>
  <title>What do HTTP URIs identify?</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/HTTP-URI.html</link>
    <guid>https://www.w3.org/DesignIssues/HTTP-URI.html</guid>
      <description><![CDATA[
  <h1>What do HTTP URIs Identify?</h1>
  <h3>Background Note</h3>
  <p>This question has been addressed only vaguely in the
  specifications. However, the lack of a very concise logical
  definition of such things had not been a problem, until the
  formal systems started to use them. There were no formal systems
  addressing this sort of issue (as far as I know, except for Dan
  Connolly's Larch work [@@]), until the <a href="/2001/sw">Semantic Web</a> introduced languages such as RDF
  which have well-defined logical properties and are used to
  describe (among other things) web operations.</p>
  <p>The efforts of the <a href="/2001/tag">Technical Architecture
  Group</a> to create an architecture document with common terms
  highlighted this problem. (It demonstrates the ambiguity of
  natural language that no significant problem had been noticed
  over the past decade, even though the original author of HTTP,
  and the later co-author of HTTP 1.1 (who also did his PhD thesis on an
  analysis of the web), both of whom have worked with Web
  protocols ever since, had had conflicting ideas of what the
  various terms actually mean.)</p>
  <p>This document explains why the author finds it difficult to
  work in the alternative proposed philosophies. If it
  misrepresents those others' arguments, then it fails, for which I
  apologize in advance and will endeavor to correct.</p>
  <h2>1. Web Concepts as here proposed</h2>
  <p>The WWW is a space of information objects. The URI was
  originally called a UDI, and originally all URIs identified
  information objects. Now, URI schemes exist which identify more
  or less anything (e.g. UUIDs) or electronic mailboxes (mailto:)
  but if we look purely at HTTP URIs, they define a web of
  information objects. Information objects -- perhaps in Cyc terms
  <a href="">ConceptualWorks</a> -- are normally things which</p>
  <ul>
    <li>Carry some sort of message, and</li>
    <li>Can be represented, to a greater or lesser authenticity, in
    bits</li>
  </ul>
  <p>I want to make it clear that such things are generic (see
  <a href="/DesignIssues/Generic">Generic Resources</a>) -- while
  they are documents, they generally are abstractions which may
  have many different bit representations, as a function of, for
  example:</p>
  <ul>
    <li>Time -- the contents can vary with revision --</li>
    <li>Content-type in which the bits are encoded</li>
    <li>Natural language in which a human-readable document is
    written</li>
    <li>Machine language in which a machine-processable document is
    written</li>
    <li>and a few more</li>
  </ul>
  <p>but the philosophy is that an HTTP URI may identify something
  with a vagueness as to the dimensions above, but it still must be
  used to refer to a unique conceptual object whose various
  representations have a very large amount in common. Formally, it
  is the publisher which defines what an HTTP URI identifies,
  and so one should look to the publisher for a commitment as to
  the exact nature of the identity along these axes.</p>
  <p>I'm going to refer to this as a <strong>document</strong>,
  because it needs a term and that is the best I have to date, but
  the reader should be sure to realize that this does not mean a
  conventional office document, it can be for example</p>
  <ul>
    <li>A poem</li>
    <li>An order for ball bearings</li>
    <li>A painting</li>
    <li>A Movie</li>
    <li>A review of a movie</li>
    <li>A sound clip</li>
    <li>A record of the temperature of the furnace</li>
    <li>An array of a million integers, all zero</li>
  </ul>
  <p>and so on, as limited only by our imagination.</p>
  <p>The Web works because, given an HTTP URI, one can, in a large
  number of cases, get a representation of the document. For a
  human readable document, the person is presented with the
  information by virtue of some gadget which is given the bits of a
  representation. In the case of a hypertext document, a reference
  to another document is encoded such that, upon user request, the
  referenced document can in turn be automatically presented. In
  the case of a machine-readable document, identifiers of concepts,
  being HTTP URIs, will often allow definitive reference
  information about those concepts to be pulled in to guide further
  actions.</p>
  <p>The web, then, is made of documents as the internet is made of
  cables and routers. The documents can be about anything, so when
  we move to talk about the contents of documents we break away
  from talking about information space and the whole universe of
  human -- and machine -- discourse is open to us. Web pages can
  compare renaissance choral works with jazz pop hits, and
  discuss whether pigs have wings. Machine-processable documents
  can encode information about shoes, and ships, and sealing-wax.
  Until recently, the Internet protocol standards out of which the
  Web is built had little to say about such things. They were
  concerned only with the human-readable side, so it was people,
  reading natural language (not internet specs) who formed and
  communicated the concepts at this level. Nowadays, however,
  semantic web languages allow information to be expressed not only
  about URIs, TCP ports and documents, but also about arbitrary
  concepts - the shoes, and ships and sealing wax, and whether pigs
  have wings. Simple semantic web applications allow one to order
  shoes and travel on ships, and determine that, given the data,
  pigs do not have wings.</p>
  <p>For these purposes it is of course quite essential to
  distinguish between something described by a document and the
  document itself. Now that we -- for the first time -- have not
  only internet protocols which can talk about documents but also
  those which talk about real world things, we must either
  distinguish or be hopelessly fuzzy.</p>
  <p>And is this bad, is it an inhibition to have to work our way
  through documents before we can talk about whatever we desire? I
  would argue not, because it is very important not to lose track
  of the reasons for our taking and processing any piece of
  information. The process of publishing and reading is a real
  social process between social entities, not mechanical agents. To
  be socially responsible, to be able to handle trust, and so on,
  we must be aware of these operations. The difference between a
  car and what some web page says about it is crucial - not only
  when you are buying a car.</p>
  <p>Some have opined that the abstraction of the document is
  nonsense, and all that exists, when a web page describes a car,
  is the car and various representations of it, the HTML, PNG and
  GIF bit streams. This is however very weak in my opinion. The
  various representations have much more in common than simply the
  car. And the relationship to the car can be many and varied: home
  page, picture, catalog entry, invoice, remote control panel,
  weblog, and so on. The document itself is an important part of
  society - to dismiss its existence is to prevent us being aware
  of the human aspects of information without which we are
  impoverished. By contrast, the difference between different
  representations of the document (GIF or PNG image for example) is
  very small, and the relationship between versions of a document
  which changes through time is a very strong one.</p>
  <h2>2. Trying out the Alternatives</h2>
  <p>The folks who disagree with the model do so for a number of
  different reasons. This article, therefore, will have to take
  them one by one, but the ones which come to mind are as
  follows:</p>
  <ol>
    <li>
      <a href="#L728">Every web page (or many of them) are in fact
      themselves representations of some abstract thing, and the
      URI really identifies that</a> thing, not a document at all.
    </li>
    <li>
      <a href="#L876">There are many levels of identification
      (representation as a set of bits, document, car which the web
      page is about) and the URI publisher, as owner of the URI,
      has the right to define it to mean whatever he or she
      likes;</a>
    </li>
    <li>
      <a href="#L883">Actually the URI has to, like in English,
      identify these different things ambiguously. Machines have to
      disambiguate using common sense and logic</a>
    </li>
    <li>
      <a href="#L890">Actually the URI has to, like in English,
      identify these different things ambiguously. Machines have to
      disambiguate using the fact that different properties will
      refer to different levels</a>.
    </li>
    <li>
      <a href="#L897">Actually the URI has to, like in English,
      identify these different things ambiguously. Machines have to
      disambiguate using extra information which will be provided
      in other ways along with the URI</a>
    </li>
    <li>
      <a href="#L909">Actually the URI has to, like in English,
      identify these different things ambiguously. Machines have to
      disambiguate them by context: A catalog card will talk about
      a document. A car catalog will talk about a car</a>.
    </li>
    <li>
      <a href="#L920">They may have been used to identify documents
      up till now, but for RDF and the Semantic Web, we should
      change that and start to use them as the Dublin Core and RDF
      Core groups have for abstract concepts</a>.
    </li>
  </ol>
  <h3 id="L728">2.1 Identify abstract things not documents</h3>
  <p>Let's take the alternatives in order. These alternatives all
  make sense. Each one, however, has problems I can't see any way
  around when we consider them as a basis.</p>
  <p>The first was,</p>
  <blockquote>
    <p>Every web page (or many of them) are in fact themselves
    representations of some abstract thing, and the URI really
    identifies that thing, not a document at all.</p>
  </blockquote>
  <p>Well, that wasn't the model I had when URIs were invented and
  HTTP was written. However, let's see how it flies. If we stick
  with the principle that a URI (or URIref) must unambiguously
  identify the same thing in any context, then we come to the
  conclusion that URIs can not identify the web page. If a web page
  is about a car, then the URI can't be used to refer to the web
  page.</p>
  <h4>2.1.1 <a name="s2.1.1" id="s2.1.1">Same URI can identify a
  web page and a car</a></h4>
  <p>What, a web page can't be a car? At this point a pedantic line
  of reasoning suggests that we should allow web pages and cars to
  conceptually overlap, so that something can be both. This is
  counterintuitive, as a web page is, in common sense, not a
  concrete object whereas a car is. But sure, we could construct a
  mathematics in which we use the terms rather specially and
  something can be at the same time a web page and a car.</p>
  <p>Frankly, this doesn't serve the social purpose of the semantic
  web, to be able to deal with common sense concepts and objects. A
  web page about a car and a car are in most people's minds quite
  distinct (as I argue further below). A philosophy in which they
  are identical does not allow me to distinguish between them. It not
  only conflicts with reality as I see it, but also leaves us no
  way to make statements individually about the two things.</p>
  <h4><img alt="A car has a different identifier -- and very different properties." src="diagrams/http-uri-1.png"></h4>
  <h4>2.1.2 <a name="identifies" id="identifies">The URI identifies
  the car, not the web page</a></h4>
  <p>So let's fall back on the idea that the URI identifies the
  <em>subject</em> of the web page, but not the web page itself.
  This makes sense. We can build the semantic web on top of that
  easily.</p>
  <p>The problem with this is that there are a large number of
  systems which already do use URIs to identify the document. This
  is the whole metadata world. Think of a few:</p>
  <ul>
    <li>The Dublin Core</li>
    <li>RSS</li>
    <li>The HTTP headers</li>
    <li>The Adobe XML system</li>
    <li>Access control systems</li>
  </ul>
  <p>(I'm sticking with the machine-processable languages as
  examples because human-processable ones like HTML have a level of
  ambiguity traditional in human natural language but quite out of
  place in the WWW infrastructure -- or the Semantic Web. You can
  argue that people say "I work for w3.org" or
  "http://www.amazon.com/shrdlu?asin=314159265359 is a great book",
  just as they happily say "<em>Moby Dick</em> weighs over three
  thousand tons", "<em>Moby Dick</em> was finished over a century
  ago" and "I left <em>Moby Dick</em> on the beach" without
  expecting to be misunderstood. So we won't use human language as
  a guide when defining unambiguously the question of what a URI
  identifies. If we want to do that on the Semantic Web, we will
  say "I work for <em>the organization whose home page is</em>
  http://www.w3.org".)</p>
  <p>Some argue that the URI which I associate with someone's home
  page actually identifies that person. They argue that
  conventionally people use the identifier to identify the person.
  However, consider another page put together by friends who found
  a photograph of the same person. A lot of content filtering
  systems would collect that URI and put it into their list. Even
  though the photo had many representations which different devices
  could download using content negotiation and/or CC/PP (color or
  black and white and versions of different resolutions) the URI
  itself would be listed as containing nudity. The public are very
  aware of different works on the web, even though they have the
  same topic.</p>
  <h4>2.1.3 <a name="Indirect" id="Indirect">Indirect
  identification</a></h4>
  <p>You can argue that a web page <em>indirectly</em> identifies
  something, of course, and I am quite happy with that. If you
  identify an organization as that which has home page
  http://www.w3.org, then you are not saying that
  http://www.w3.org/ itself is that organization. This scenario is
  very very common, just as we identify people and things by their
  "unambiguous properties": books by ISBN, people by email address,
  and so forth. So long as we don't think that the person
  <em>is</em> an email address, we are fine. Some people have
  thought that in saying "An HTTP URI can't identify an
  organization" I was ruling out this indirect identification, but
  not so: I am very much in favor of it. The whole SQL world, after
  all, only identified things indirectly by a key property. This
  causes no contradiction. Perhaps I should say "An HTTP URI can't
  directly identify an organization". But by "identify" I mean
  "directly identify", and "identity" is a fairly direct word and
  concept, so I will stick with it.</p>
  <p>Conclusion so far: the idea that a URI identifies the thing
  the document is about doesn't work because we can only use a URI
  to identify one thing and we have and already do use it to
  identify documents on the web.</p>
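  <p>The indirect identification described in 2.1.3 can be sketched
  in a few lines. This is only an illustration, assuming a toy
  in-memory store of statements: the property name
  <code>homePage</code>, the helper
  <code>thing_with_home_page</code>, and the sample names and titles
  are all hypothetical, not part of any specification.</p>

```python
# Toy store of facts about things that have no URI of their own.
# Each thing is described by properties; "homePage" (a hypothetical
# property name) holds the URI of a document about the thing.
things = [
    {"kind": "Organization", "name": "W3C", "homePage": "http://www.w3.org/"},
    {"kind": "Person", "name": "Albert", "homePage": "http://example.com/albert"},
]

# Facts about documents are keyed directly by the document's URI.
documents = {
    "http://www.w3.org/": {"kind": "Document", "title": "W3C home page"},
}


def thing_with_home_page(uri):
    """Return the one thing whose home page is `uri`, or None.

    The URI is used as a key property, much as an ISBN picks out a
    book; the URI itself still identifies only the document.
    """
    matches = [t for t in things if t["homePage"] == uri]
    return matches[0] if len(matches) == 1 else None


org = thing_with_home_page("http://www.w3.org/")
doc = documents["http://www.w3.org/"]
print(org["kind"], doc["kind"])
```

  <p>Statements about the organization and statements about its home
  page stay separate here, even though a single URI is enough to find
  both.</p>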
  <h4>2.1.4 <a name="argument" id="argument">The argument for HTTP
  URIs identifying a Conceptual Work</a></h4>
  <p>So what's wrong with the URI being taken to identify whatever
  the owner says?</p>
  <p>Let's look at what we mean by <em>identifies</em>. When we say
  there is identity, that means that there is some form of sameness
  that we associate with the identifier. Now, for all the
  philosophical argument, we can never test the identity of an
  abstract thing. What we can test is a representation which has
  been returned by the server when given that URI. When we use
  a URI, and get back several possible representations of it, then
  what expectation do we have about those representations?</p>
  <p>Take the test case that I see the web page which has a picture
  of a car, and I see the URI in the URL bar of the browser. I
  email you the URI, saying "You see, the car is a Toyota?". You click on
  the link. Your browser shows the same URI as mine in the "URL
  bar" but you see a table of the car's weight, length, height,
  color, and registration number. We are confused. The web didn't
  work because you didn't get the same information as me. I
  expected you to get the same information, basically. That is how
  the Web works. That is the expectation behind every hypertext
  link - that the follower of the link should get basically the
  same information as the person who made the link. I say,
  "basically" because I would not have cared whether you saw a
  JPEG or a GIF. It probably wouldn't have mattered if you had seen
  a lower resolution or even black-and-white copy of the picture.
  If you are visually impaired, you may have been able to manage
  with a well-written description of the picture. But the
  essential information is the same, not just the subject of the
  page.</p>
  <p>So now we have put the four corners on the expectation we have
  of a URI -- that all representations have essentially the same
  <em>information content</em>. And what we mean by "essentially"
  allows in fact some wriggle room, and in the end it rests on a
  common understanding between publisher of the information and
  quoter of the URI. The sameness we are after is the sameness of
  information content. <em>That</em> is what is identified by the
  URI. That is why we say that the URI identifies that conceptual
  information content, irrespective of its particular
  representation: the <em>conceptual work</em>. Without that common
  understanding, the web does not work.</p>
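  <p>The expectation that every representation served for one URI
  carries essentially the same information content can be
  illustrated with a small sketch of content negotiation. It rests on
  stated assumptions: the URI, the media types on offer, the byte
  strings and the helper <code>get_representation</code> are made up
  for this example, and the handling of the Accept header is
  simplified to an ordered list of media types.</p>

```python
# A "document" (conceptual work) with several bit representations.
# The URI, media types and byte strings are made up for illustration.
REPRESENTATIONS = {
    "http://example.com/car-photo": {
        "image/jpeg": b"jpeg bytes of the car picture",
        "image/gif": b"gif bytes of the car picture",
        "text/plain": b"A written description of the car picture",
    },
}


def get_representation(uri, accept):
    """Return (media_type, body) for the first acceptable type.

    `accept` is a list of media types in order of preference, a
    simplification of the HTTP Accept header (no q-values, no
    wildcards).
    """
    available = REPRESENTATIONS[uri]
    for media_type in accept:
        if media_type in available:
            return media_type, available[media_type]
    raise LookupError("no acceptable representation")


uri = "http://example.com/car-photo"
mine = get_representation(uri, ["image/jpeg", "image/gif"])
yours = get_representation(uri, ["text/plain"])
# Different bits, same URI: both are meant to convey the same content.
print(mine[0], yours[0])
```

  <p>Nothing in the code can check that the three byte strings really
  say the same thing. That commitment comes from the publisher, which
  is the common understanding the text relies on.</p>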
  <p>Some people have said, "If we say that URIs identify people,
  nothing breaks". But all the time they, day to day, rely on
  sameness of the information things on the web, and use URIs with
  that implicit assumption. As we formalize how the web works, we
  have to make that assumption explicit.</p>
  <h3 id="L876">2.2 Author definition</h3>
  <p>So how can we break free of that line of reasoning? We can try
  throwing away the rule that a URI identifies only one thing.</p>
  <blockquote>
    <p>There are many levels of identification (representation as a
    set of bits, document, car which the web page is about) and the
    URI publisher, as owner of the URI, has the right to define it
    to mean whatever he or she likes.</p>
  </blockquote>
  <p>Well, this one is tempting from the point of view that the
  owner of an identifier should reign supreme when it comes to
  saying what it identifies. It is quite a logically consistent
  position to take. After all, isn't this the case with
  <code>uuid</code>'s? And for a new scheme, this would be
  interesting. How can we do it, though, with HTTP? The problem is
  an engineering one: I can't in practice use a URI until I have
  some definitive information from the publisher as to what it
  identifies.</p>
  <h4>2.2.1 Default</h4>
  <p>Why can't a URI default to identifying a web page until you
  know otherwise? Because the web is open and you will never know
  when you might learn some other information which will make the
  default incorrect. (You can't use such "closed world"
  reasoning).</p>
  <h4>2.2.2 Web operation</h4>
  <p>Why can't a URI identify a web page until you have done some
  well-defined operation -- such as HTTP HEAD or GET -- and checked
  for information in that? Well, that would certainly work
  logically. Suppose we define a return code or HTTP header
  which means "abstract object requested". It would mean that every
  web application which deals with web pages as web pages would
  actually be working under an ambiguity, and RDF processors could
  be programmed to look for that special information. We can't
  retrofit the millions of web servers out there, I assume.</p>
  <p>I feel that there is a great benefit to fixing this question
  at the spec level. Otherwise, what happens? I read a web page, I
  like it and I am going to annotate it as being a great one -- but
  first I have to find out whether the URI in my browser is used,
  conceptually, by the author of the page to represent some
  abstract idea? Before I recommend the <em>Vietnam War</em> page,
  I have to be careful I am not recommending the Vietnam War.</p>
  <p>There has been no way to do this before RDF, but then
  similarly no real need for it. (What, is this just a problem with
  RDF? No, it will happen with any webized knowledge representation
  system.). We really need to have communication in which two
  people use the same URI to mean the same thing. If there</p>
  <p>We could fix HTTP so that it would return me some extra
  semantic headers explaining the whole thing. And in the case that
  the URI was deemed to be some abstract thing, I would not have
  the option of recommending the web page. Too bad: it has no
  URI.</p>
  <p>The authors of document
  &lt;http://www.w3.org/2000/10/rdf-tests/rdfcore/Manifest.rdf&gt;
  certainly thought that they could use
  "http://www.w3.org/2000/10/rdf-tests/TestSchema/NegativeParserTest"
  to identify an abstract thing which is a type of software test.
  Now they have a choice as to what to make the server return for
  them when I ask for it. It returns 404 "doesn't match anything we
  have available". It can't really, because HTTP doesn't allow one
  to return a class, only a document. And if it were to return a
  document, then I wouldn't be able to refer to that document
  without accidentally referring to the class of negative parser
  tests.</p>
  <p>So, we could change HTTP to make this work. We could make a
  new form of redirect, <em>343 Abstract Object, please see . .
  .</em>, which would tell the client that the thing requested was
  abstract, and would suggest a document to read about it. This
  avenue of argument is still outstanding. We could take it. It
  isn't the status quo, but we could make changes in HTTP if the
  community felt that this was the way to go.</p>
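  <p>How a client might treat such a redirect can be sketched as
  follows. This is purely hypothetical: 343 is not a defined HTTP
  status code, the use of the <code>Location</code> header to carry
  the suggested document is an assumption made here, and the function
  <code>classify</code> and the example URI are invented for the
  illustration.</p>

```python
# Hypothetical: 343 is NOT a real HTTP status code. It stands for the
# "Abstract Object, please see ..." redirect proposed in the text.
ABSTRACT_OBJECT = 343


def classify(status, headers):
    """Say what a URI identifies, given a response's status and headers.

    Returns a (kind, about) pair. For an abstract thing, `about` is the
    URI of a document describing it, read from the Location header
    (an assumption of this sketch); otherwise `about` is None.
    """
    if status == ABSTRACT_OBJECT:
        return "abstract thing", headers.get("Location")
    if status in range(200, 300):
        # An ordinary successful response: the URI identifies a document.
        return "document", None
    return "unknown", None


kind, about = classify(
    ABSTRACT_OBJECT,
    {"Location": "http://example.com/about-negative-parser-tests"},
)
print(kind, about)
```

  <p>Under this scheme an annotation tool could attach "great page"
  to the <code>about</code> document rather than to the abstract
  thing, but only once every server answering for an abstract thing
  had been changed to send the new code.</p>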
  <h3 id="L883">2.3 Logic disambiguates</h3>
  <p>Otherwise, we have to try another way of letting the URI mean
  sometimes one thing and sometimes another. Here is another.</p>
  <blockquote>
    <p>Actually the URI has to, like in English, identify these
    different things ambiguously. Machines have to disambiguate
    using common sense and logic</p>
  </blockquote>
  <p>This is possible in theory. It is a mess. It fails
  particularly spectacularly when a URI is used ambiguously to
  refer to a web page and the thing that web page is about, which
  happens to be another web page. <em>Anyone can write anything
  about anything</em> is a Web motto, but here it falls down.
  <em>Anyone can write anything about anything except those things
  which might get confused with the document they are writing</em>.
  It breaks the axiom that we mean the same thing by a URI - in all
  contexts. (And RDF has a model theory in which necessarily in any
  interpretation, a symbol always denotes one thing).</p>
  <h3 id="L890">2.4 Different Properties</h3>
  <blockquote>
    <p>Actually the URI has to, like in English, identify these
    different things ambiguously. Machines have to disambiguate
    using the fact that different properties will refer to
    different levels.</p>
  </blockquote>
  <p>One way of getting here is to start by considering that HTTP
  headers can be divided into those which refer to the
  representation (or the document) and those that refer to, say, a
  car or a donkey. We can look at all RDF properties and other
  attributes in other languages and divide them in such a way.
  So, when I say "http://example.com/albert is a color photo", I am
  referring to the representation; when I say
  "http://example.com/albert used to work down the mill" I am
  referring to the person; when I say "http://example.com/albert
  was taken on a rainy day" I am referring to the original
  photograph, which is basically the representation of Albert.</p>
  <p>This one has the problem when a web page refers to a web page.
  It can still be pursued, by having different verbs for talking
  about ownership of the web page and ownership of the car. This is
  a classic example of the 2-level syndrome (see also
  <em>Dictionaries in the Library</em>). The basic fallacy is that
  you can make the system general by introducing a second level - a
  new set of attributes, properties, or whatever, which allow you
  to refer to the metadata of something separately from the thing
  itself. These systems either turn out to be just limited 2-level
  systems (like XML and DTDs) or have to be extended to be
  recursive in some way later on such that in fact the two levels
  become unnecessary.</p>
  <h3 id="L897">2.5 Extra info with URI</h3>
  <blockquote>
    <p>Actually the URI has to, like in English, identify these
    different things ambiguously. Machines have to disambiguate
    using extra information which will be provided in other ways
    along with the URI</p>
  </blockquote>
  <p>This twist now relies on sending extra information with a URI.
  Effectively, the URI scheme has now failed to identify anything
  by itself. Those most familiar with URIs as used by HTML have sometimes
  suggested adding new attributes to the anchor tags of HTML
  documents to disambiguate a reference. I guess it would work if
  HTML anchors were the only uses of URIs. By contrast, they are
  used in thousands of places and ways, many of which I am unaware of.
  The architecture, however, is not that way: the architecture of
  the WWW is that a URI is a global unambiguous identifier. Not a
  URI and something else.</p>
  <p>(The various designs such as WebDAV's PROPFIND which use HTTP
  methods apart from GET to retrieve information suffer from this
  same problem. The information does not have a URI: it is not on
  the web.)</p>
  <h3 id="L909">2.6 Different meaning in different context</h3>
  <blockquote>
    <p>Actually the URI has to, like in English, identify these
    different things ambiguously. Machines have to disambiguate
    them by context: A catalog card will talk about a document. A
    car catalog will talk about a car.</p>
  </blockquote>
  <p>This works in the short term, when the two contexts are
  disjoint groups who do not need to communicate. It is in fact the
  current state: the groups of people who use HTTP URIs to talk
  about documents, and those who have just started to use them to
  talk about abstract concepts haven't collided yet. (Well, they
  have in my code. I need to be able to model the metadata about an
  HTTP URI as that about a document, and it being a class at the
  same time doesn't jibe.)</p>
  <p>It doesn't work in the long term because it breaks the axiom
  that a URI must identify one thing.</p>
  <h3 id="L920">2.7 Change it for the Semantic Web</h3>
  <blockquote>
    <p>They may have been used to identify documents up till now,
    but for RDF and the Semantic Web, we should change that and
    start to use them as the Dublin Core and RDF Core groups have
    for abstract concepts.</p>
  </blockquote>
  <p>I think that we would have to design a new URI scheme before
  we change things that much. That is tempting of course. But then
  -- building a semantic web out of what we have is tempting too.
  It was tempting to rehash TCP a little when making HTTP. It
  wasn't practical, and we would have lost a lot more than we would
  have gained. There is a lot to be said for using common
  technology. We've got an infrastructure of documents. We want to
  build an infrastructure of knowledge. Let's build it using the
  documents. We might find that the commonality with the web of
  human-readable information is a boon.</p>
  <h3 id="L735">2.8 Abandon any identification of abstract
  things</h3>
  <p>An argument which surprised me is that yes, HTTP URIs identify
  documents, but in fact the fragment identifier must only be used
  to identify parts -- fragments -- of documents. This means that
  RDF cannot in fact use HTTP URI schemes at all. A completely
  different system would have to be put together -- either a new
  set of URIs, or RDF conventions in which the relationship to the
  part of a document in which something was described became
  explicit. In N3 this would look like</p>
  <p>[ is rdf:referent of &lt;#fmyCar&gt; ] [ is rdf:referent of
  &lt;#color&gt; ] [ is rdf:referent of &lt;#blue&gt; ]</p>
  <p>Of course, languages would quickly generate special syntax for
  this. Alternatively, the RDF system would be built entirely on the
  understanding that we were referring always to that denoted by a
  given bit of document, not the bit of document itself. This would
  mean that there would be no way for the RDF system to refer to
  documents themselves directly.</p>
  <p>This is actually a consistent way of working. It would be a
  change only for those people who use RDF to talk about documents
  as documents. We could change.</p>
  <h2><a name="L409" id="L409">3. Conclusion</a></h2>
  <p>I didn't have this thought out a few years ago. It has only
  been in actually building a relatively formal system on top of
  the web infrastructure that I have had to clarify these concepts
  in my own mind. I am forced to conclude that modeling the HTTP
  part of the web as a web of abstract documents is the only way to
  go which is practical and, by the philosophical underpinnings of
  the WWW, tenable.</p>
  <p>I apologize again if I have misunderstood or misrepresented
  others' arguments in the process of explaining my own
  position.</p>
  <p>Tim Berners-Lee</p>
  <p>2002-07-28Z</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 01 Jun 2005 00:00:00 GMT</pubDate>
  <title>What HTTP URIs identify? II</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/HTTP-URI2.html</link>
    <guid>https://www.w3.org/DesignIssues/HTTP-URI2.html</guid>
      <description><![CDATA[
  <h1>What HTTP URIs Identify</h1>
  <h2><a name="Abstract" id="Abstract">Abstract</a></h2>
  <p>HTTP URIs, in the web architecture, have been used to denote
  documents -- "web pages" informally, or "information resources"
  more formally. However, with the growth of the Semantic Web,
  which uses URIs to denote anything at all, the urge to use and
  practice of using HTTP URIs for arbitrary things grew steadily.
  The W3C Technical Architecture Group (TAG) eventually resolved
  the architectural problem by deciding that if an HTTP response
  code of 200 (a successful retrieval) was given, that indicated
  that the URI was indeed for an information resource, but that
  with no such response, or with a different code, no such
  assumption could be made. This compromise resolved the issue,
  leaving a consistent architecture.</p>
  <h2 id="Introducti">Introduction</h2>
  <p>HTTP URIs, in the web architecture, have been used to denote
  documents -- "web pages" informally, or "information resources"
  more formally. However, with the growth of the Semantic Web,
  which uses URIs to denote anything at all, the urge to use and
  practice of using HTTP URIs for arbitrary things grew steadily.
  The Dublin Core project, one of the first RDF vocabularies, and
  later Friend of a Friend, and various others simply used HTTP
  URIs to identify RDF Properties. The result was that one could no
  longer be sure that an HTTP URI was intended to identify the web
  page one got when one used the URI in a browser. In fact, there
  was a danger of confusion if one party used the URI for an
  abstract concept and another used it for the web page. The author
  wrote a long <em>Design Issues</em> note about this, <a href="HTTP-URI.html">What do HTTP URIs Identify?</a>. The reader is
  directed to read that if more detail of the arguments is
  needed.</p>
  <p>This whole issue caused, until 2005, a lot of discussion in
  technical circles, and much heated debate. In June 2005, the TAG
  resolved the issue as a function of the runtime protocol
  response. Basically, the argument is that if you have used a URI
  to get a web page, then you can use the URI to identify the
  Information Resource which is that web page: For example, the New
  York Times home page, or this page you are reading now.</p>
  <h3 id="Resolution">Resolution</h3>
  <p>The TAG resolution effectively extends the range of things one
  can use HTTP URIs for. However, it does not allow one to simply serve
  a web page at a URI which is used for something else. Of course,
  it is a general principle of web architecture that it is useful
  to serve information to those that look up a URI. In the case
  that the URI is not intended to be used for an information
  resource.</p>
  <p>The W3C Technical Architecture Group (TAG) eventually resolved
  the architectural problem by deciding that if an HTTP response
  code of 200 (a successful retrieval) was given, that indicated
  that the URI was indeed for an information resource, but that
  with no such response, or with a different code, no such
  assumption could be made. This compromise resolved the issue,
  leaving a consistent architecture.</p>
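  <p>As a rough illustration only, the rule can be sketched as a small
  Python function. The function name and the use of <code>None</code>
  to mean "no assumption can be made" are choices made for this
  sketch, not part of the TAG resolution itself.</p>

```python
from typing import Optional


def denotes_information_resource(status_code: Optional[int]) -> Optional[bool]:
    """Apply the rule described above to the outcome of an HTTP GET.

    status_code is the response code, or None when there was no response.
    Returns True when a 200 response was given, meaning the URI is for an
    information resource. Returns None in every other case: no assumption
    can be made either way.
    """
    if status_code == 200:
        return True
    return None
```

  <p>Note that the function never returns <code>False</code>: the
  absence of a 200 response says nothing about what the URI
  identifies.</p>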
  ]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 10 Apr 2015 00:00:00 GMT</pubDate>
  <title>Mapping between HTTP URLs and filenames on a server</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/HTTPFilenameMapping.html</link>
    <guid>https://www.w3.org/DesignIssues/HTTPFilenameMapping.html</guid>
      <description><![CDATA[
  <h1>Mapping between HTTP URLs and filenames on a server</h1>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 01 Dec 2016 00:00:00 GMT</pubDate>
  <title>Icing on the cake pattern: URIS of services and metadata close to the URI of the target</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/IcingOnTheCake.html</link>
    <guid>https://www.w3.org/DesignIssues/IcingOnTheCake.html</guid>
      <description><![CDATA[
  <h1>Icing on the cake</h1>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 01 Jan 2008 00:00:00 GMT</pubDate>
  <title>Identity: how to identify what in RDF</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Identity.html</link>
    <guid>https://www.w3.org/DesignIssues/Identity.html</guid>
      <description><![CDATA[
    <h3>
      <a name="Identifier" id="Identifier">Identifiers - what is
      identified?</a>
    </h3>
    <p>
      When XML is used to represent a directed labelled graph which
      is used to represent information about things, then one must
      be able to make statements about parts of an XML document,
      parts of the DLG (such as RDF nodes) and of course the
      objects described.
    </p>
    <p>
      In most cases it seems obvious to the human reader. The jam
      jar label text does not (normally) read "jam jar label text"
      or "jam jar label" or "jam jar" but "jam".
    </p>
    <p>
      Take the case of a statement about a person in an imaginary
      syntax
    </p>
    <pre>&lt;z:person id="foo"&gt;
   &lt;head&gt;
      &lt;play:author&gt;Zoe&lt;/play:author&gt;
   &lt;/head&gt;
   &lt;play:name&gt;Albert&lt;/play:name&gt;
   &lt;play:mailbox resource="mailto:adoe@bar.com"/&gt;
   &lt;play:son-name&gt;Bill&lt;/play:son-name&gt;
   &lt;play:daughter-name&gt;Claire&lt;/play:daughter-name&gt;
   &lt;play:father&gt;
      &lt;z:name value="Joe"/&gt;
      &lt;z:wrote href="#foo"/&gt;
      &lt;z:friend resource="#foo"/&gt;
   &lt;/play:father&gt;
&lt;/z:person&gt;
</pre>
    <p>
      The XML element has one attribute and four child elements.
      The RDF node has three properties (stated here). The person
      Albert has two children. What do we refer to if we refer to
      "#foo"? Of course we refer to the element - but when we make
      RDF statements, we normally want to refer to the RDF node, or
      rather the object described by the node, in RDF terms the
      <em>resource</em>.
    </p>
    <p>
      Of course, in a typical unix programming language we would
      simply add a syntax character to distinguish the forms of
      reference: #foo would be the node, and @#foo (or something)
      would be the object referred to. But in this case we are
      trying to do everything with RDF, and what is left with XML,
      and so we would lose a few points by adding instead some
      totally new syntax. What we <em>can</em> do is to use
      different attribute names for the different forms of
      reference. The attribute names I used above are as follows:
    </p>
    <table border="1">
      <caption>
        Forms of reference to the object of a property
      </caption>
      <tbody>
        <tr>
          <td>
            <code>value</code>
          </td>
          <td>
            literal string
          </td>
        </tr>
        <tr>
          <td>
            <code>href</code>
          </td>
          <td>
            taking the string as a URI with or without fragment
            identifier, the text (or XML fragment or whatever
            medium) to which it refers.
          </td>
        </tr>
        <tr>
          <td>
            <code>resource</code>
          </td>
          <td>
            taking a string as a URI with fragment identifier, the
            abstract RDF object (rdf:resource) corresponding to the
            identified XML document fragment.
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      Here I have used "href" to allow RDF to refer to the XML
      model. This is important, as for example it is bits of XML
      which one digitally signs, not (in signed XML) bits of RDF.
      Also, it is useful for RDF to be able to talk about XML
      elements. It brings up the question of what an RDF fragment
      identifier means.
    </p>
    <h2 id="clash">
      RDF and XML fragment identifiers clash
    </h2>
    <p>
      <em>This highlights (2000/02) a bug in the relationship
      between XML and RDF</em>
    </p>
    <p>
      Consider what is identified by
    </p>
    <p style="text-align: center">
      <code>http://.../foo.rdf#bar</code>
    </p>
    <p>
      when <code>...foo.rdf</code> contains among other things the
      following:
    </p>
    <pre>&lt;rdf:description rdf:id="bar"&gt;
   &lt;rdf:type resource="...#person"/&gt;
   &lt;y:common-name&gt;Ora Lassila&lt;/y:common-name&gt;
   &lt;y:mailbox&gt;ora.lassila@research.nokia.com&lt;/y:mailbox&gt;
&lt;/rdf:description&gt;
</pre>
    <p>
      The meaning of the fragment identifier is taken from the
      specification associated with the MIME type.
    </p>
    <p>
      Therefore, if this is taken as a document of type
      application/rdf, then the fragment identifier identifies the
      thing (person in this case, Ora) described by the RDF node.
      This is how references are used in RDF.
    </p>
    <p>
      However, if it is considered to be of type text/xml then the
      fragment identifier is defined by the XML spec, and so
      references an element whose XML ID attribute has the value
      "bar". It happens that the <code>rdf:id</code> is
      <em>not</em> defined to be an xml:id but is defined to "act
      like one", whatever that means, by the RDF spec. So it isn't
      clear whether the reference to this would be to the XML
      subtree (consisting of the rdf:description element and its
      contents) or would be undefined or possibly a reference to
      some other element which happened to have id="bar".
    </p>
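    <p>
      The dependence on the notional type can be made concrete with a
      small Python sketch. The function name and the returned
      descriptions are invented for illustration; the two media type
      strings are simply the ones used in the text above.
    </p>

```python
def fragment_referent(media_type: str, fragment: str) -> str:
    """Describe what '#fragment' refers to under a given notional document type.

    Illustrative only: the same URI yields a different referent depending
    on which specification is taken to govern the fragment identifier.
    """
    if media_type == "application/rdf":
        # RDF reading: the thing described by the node, not the markup.
        return "the thing described by the RDF node with rdf:id '" + fragment + "'"
    if media_type == "text/xml":
        # XML reading: a piece of the document itself.
        return "the XML element whose ID is '" + fragment + "'"
    return "not defined by this sketch"
```

    <p>
      Calling it twice with the same fragment "bar" and the two
      different types gives two different answers, which is exactly
      the clash described here.
    </p>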
    <p>
      To have a different interpretation of a URI as a function of
      the notional type of the document belies the fact that the
      point of using XML syntax for RDF was that RDF documents should
      be XML documents! Of course we embed RDF in regular XML
      documents. So this distinction is nonsense.
    </p>
    <p>
      Of course, the RDF spec can simply use the XML definition
      indirectly and refer to the RDF node described by the XML
      element. However, this is not powerful enough for RDF. This
      is because RDF needs to be able to make statements about XML
      documents and XML elements. So for example, I might want to
      state that I wrote the above snippet. It would be very
      tempting to write that I am the author of foo.rdf#bar. But I
      am not the author of Ora Lassila. RDF uses parseType to
      resolve this for inline data: parseType=Resource indicates
      that the reference is to the RDF object, and
      parseType=Literal indicates that it is to the XML. The thing
      could be resolved with an interpretation property which
      expresses the relationship between an XML subtree and an RDF
      object which it describes. While it would be good to define
      that property, RDF syntax needs a shortcut. I would propose
      that "resource=" which is used to point to a resource be also
      used for a resource fragment id, and that a new syntax be
      introduced to refer to the actual RDF node, maybe "object="
      which happens here to correspond to the (subject, predicate,
      object) sense -- as well as a "thing" sense. (The former is
      the reason for choosing it - the attribute should
      express the relationship, not the class of the thing referred
      to in general!).
    </p>
    <h2>
      <a name="Naming" id="Naming">Naming properties and
      elements</a>
    </h2>
    <p>
      We have a similar problem in the XML-RDF relationship looking
      at identity at the schema level.
    </p>
    <p>
      In RDF M&amp;S 1.0, a property name defined in a namespace is
      formed by directly concatenating the namespace URI with the
      local tag name of the XML element.
    </p>
    <p>
      One natural way to use this is to end the namespace URI with
      "#" so that the local tag name becomes the fragment
      identifier. When the schema is written in XML, this implies
      that the tag name, being a simple alphanumeric, will identify
      something in the document by its XML ID. This is a constraint
      on the schema language: the XML ID of an element must be
      usable as a reference to the thing being defined.
    </p>
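    <p>
      A minimal Python sketch of this concatenation follows. The
      namespace URI <code>http://example.org/schema#</code> and the
      helper names are hypothetical, chosen only to show how a
      trailing "#" turns the local tag name into a fragment
      identifier.
    </p>

```python
from typing import Tuple


def property_uri(namespace_uri: str, local_name: str) -> str:
    """Form a property name by direct concatenation, as described above."""
    return namespace_uri + local_name


def split_fragment(uri: str) -> Tuple[str, str]:
    """Split a URI into the document part and the fragment identifier."""
    document, _, fragment = uri.partition("#")
    return document, fragment


# Hypothetical namespace URI ending in "#".
uri = property_uri("http://example.org/schema#", "priority")
document, fragment = split_fragment(uri)
```

    <p>
      Here <code>fragment</code> is the local tag name, so the schema
      document would need an element whose XML ID is that name.
    </p>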
    <p>
      When there is a 1:1 mapping between RDF properties and XML
      element types, there is a choice of
    </p>
    <ol>
      <li>giving them the same URI and distinguishing which is
      referred to by context (as in resource= and object= above),
      or
      </li>
      <li>giving them different URIs which are algorithmically
      related, like assuming that #foo-element means the element
      defining #foo, using a convention specified in the schema
      language, or
      </li>
      <li>giving them totally distinct URIs which can be connected
      by an assertion in the schema, or an in
      </li>
    </ol>
    <p>
      Given that it is interesting to use RDF to make statements
      about XML element types, having different names is appealing.
      As writing down the relationship every time is unappealing,
      the algorithmic link is appealing.
    </p>
    <h3>
      <a name="generic" id="generic">A generic problem with XML
      identifiers</a>
    </h3>
    <p>
      <span class="detail">(I notice in passing that XML has
      currently a mixture of identifier spaces which is a little
      confusing.</span>
    </p>
    <p class="detail">
      The element and attribute namespace is very well handled in
      terms of abbreviations, and is grounded in URI space, using
      the XML namespaces spec.
    </p>
    <p class="detail">
      The URI space is of course the same space, but when value is
      typed as a URI, then it cannot use the abbreviation system of
      the element namespace.)
    </p>
    <h3>
      <a name="IDREF" id="IDREF">IDREF considered harmful</a>
    </h3>
    <p>
      The local identifier space is a subset of URI space. When an
      attribute is defined as a URI, the simple "#" prefix gives
      access to the local ID space - while still allowing great
      power of expression by reference to anything else on the Web.
      When the "idref" form is used, this is not possible. The
      idref form is a weak form IMHO and not wise for new designs
      which are not to be deliberately constraining.
    </p>
    <p>
      Others have noticed this problem and there have even been
      suggestions which confused the URI prefix and the namespace
      prefix. In fact the problem can be solved [ref eric
      whiteboard] with an escape of some sort. One possibility is
      ambushing a void URI scheme name by using a colon prefix
      (suggested by Eric Prud'hommeaux)
    </p>
    <p class="detail">
      <code>href=":rdf:description"</code>
    </p>
    <p>
      would be a perfectly valid URI (in an XML context) which
      referenced the rdf:description URI using the defined rdf:
      namespace. I feel this is messy, as it would have to be
      subject to different handling than any other URI: its
      expansion would be done in an XML-specific way.
    </p>
    <p>
      The other link you need is the ability to use a full URI as an
      element name: when using an element name which only occurs
      once, and without changing the default namespace, it would
      clearly be logical to be able to just write
    </p>
    <p class="detail">
      <code>&lt;http://foo.com/schemas/memo6.2#priority&gt;a&lt;/[...]&gt;</code>
    </p>
    <p>
      Because what follows uses the full power of what precedes
      with generality, we may need to see the first in use before
      the paper is over. But I can't see making the second change
      to XML.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 26 Jun 2010 00:00:00 GMT</pubDate>
  <title>Limiting the damage of an inconsistency</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Inconsistent.html</link>
    <guid>https://www.w3.org/DesignIssues/Inconsistent.html</guid>
      <description><![CDATA[
    <h2>
      Inconsistent data
    </h2>
    <p>
      What, many people ask, will happen when this huge mass of
      classical logic meets its first inconsistency? Surely, once
      you have one statement that A and another somewhere on the web
      that not A, then doesn't the whole system fall apart? Surely,
      then you can deduce anything you want?
    </p>
    <p>
      This fear of course is quite valid - or would be if all
      assertions in the whole world were regarded as being on equal
      footing. Some imagine that an RDF parser will simply search
      all XML documents on the web for any facts, and add them to a
      massive set of believed assertions. This is not how realistic
      systems will actually work.
    </p>
    <p>
      On the web, a fact may be asserted in an expression. That
      expression may be part of a formula. The formula may involve
      negation, and may involve quotation. The whole formula is
      found by parsing some document. There is no a priori reason
      to believe any document on the web. The reason to believe a
      document will be found in some information (metadata) about
      the document. That metadata may be an endorsement of the
      document - another RDF statement, which in turn was found in
      another document, and so on.
    </p>
    <p>
      <em>[@@need picture here]</em>
    </p>
    <p>
      A real system may work backwards or forwards (or both). I
      would call working forwards a system which is given a
      configuration page to work from which in turn points to other
      pages which in turn are used as valid data. I would call
      working backwards a system which, when looking for an answer
      to a query, looks at a global index to find any document at
      all which mentions a given term. It then searches the
      documents turned up for answers to the query. Only when it
      has found an answer does it check back to see whether the data
      can be derived directly or indirectly from sources it has
      been set up to trust.
    </p>
    <p>
      Digital signature (see trust) of course adds a notion of
      security to the whole process. The first step is that a
      document is not endorsed without giving the checksum it had
      when believed. The second step is to specify more powerful
      rules of the form
    </p>
    <blockquote>
      <p>
        "whatever any document says so long it is signed with key
        57832498437".
      </p>
    </blockquote>
    <p>
      In practice, particular authorities are trusted only for
      specific purposes. The semantic web must support this. You
      must be able to restrict the information believed along the
      lines of,
    </p>
    <blockquote>
      <p>
        "whatever any document says of the form xxxx is a member of
        W3C so long as it is signed with key 32457934759432".
      </p>
    </blockquote>
    <p>
      for example
    </p>
    <blockquote>
      <p>
        "whatever any document says of the form "a is an employee
        of IBM" so long as it is signed with key 3213123098129".
      </p>
    </blockquote>
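    <p>
      A rule of this kind can be sketched in Python as a filter over
      statements taken from signed documents. Everything named here
      (<code>SignedDocument</code>, <code>TrustRule</code>,
      <code>believed</code>, the predicate strings) is hypothetical,
      and the sketch assumes that signature verification has already
      happened elsewhere.
    </p>

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class SignedDocument:
    signing_key: str                 # key whose signature is assumed verified
    statements: Tuple[Triple, ...]   # what the document says


@dataclass(frozen=True)
class TrustRule:
    key: str        # only documents signed with this key qualify
    predicate: str  # only statements of this form are believed
    obj: str


def believed(documents: Iterable[SignedDocument],
             rules: Iterable[TrustRule]) -> List[Triple]:
    """Return the statements that match some rule's form and signing key."""
    rules = list(rules)
    accepted: List[Triple] = []
    for doc in documents:
        for statement in doc.statements:
            _, predicate, obj = statement
            if any(doc.signing_key == r.key and predicate == r.predicate
                   and obj == r.obj for r in rules):
                accepted.append(statement)
    return accepted
```

    <p>
      A document signed with the right key is still only believed for
      statements of the stated form; anything else it says is ignored.
    </p>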
    <h3>
      Limiting inference
    </h3>
    <p>
      There is a choice here, and I am not sure right now which
      appeals to me most. One is to say precisely,
    </p>
    <blockquote>
      <p>
        "whatever any document <em><strong>says</strong></em> of
        the form xxxx is a member of W3C so long as it is signed
        with key 32457934759432".
      </p>
    </blockquote>
    <p>
      The other is to say,
    </p>
    <blockquote>
      <p>
        "whatever is of form xxxx and <em><strong>can be
        inferred</strong></em> from information signed with key
        32457934759432"
      </p>
    </blockquote>
    <p>
      In the first case, we are making an arbitrary requirement for
      a statement to be phrased in a particular way. This seems
      unnecessarily bureaucratic, and more difficult to treat
      consistently. Normally we like to be able to replace any set of
      formulae with another set which can be deduced from it.
      However, in this case we have to preserve the actual form in
      case we need to match it against a pattern. This is very
      messy.
    </p>
    <p>
      In the second case, we fall prey to the inconsistency trap.
      Once any pair of conflicting statements can be deduced from
      information signed with a given key, then anything can be
      deduced from information signed with the key: the key is
      completely broken. Of course, only that key is broken, so a
      trust system can remove any reason it has to trust that key.
      However, the attacked system may not realize what has
      happened before it has been convinced that the sun rises in
      the west.
    </p>
    <p>
      Is there a way to limit the domain of trust in a key while
      allowing information to be processed in a consistent way
      throughout the system? Yes - maybe - there are many. Each KR
      system which uses a limited logic does so in order (partly)
      to solve this problem. We just qualify "can be inferred" by
      the type of inference rules which may be used. This means the
      generic proof engine either has to work through a reified
      version of the rules or it has to know the sets - incorporate
      each proof engine. Maybe we only need one.
    </p>
    <h3>
      Expiry
    </h3>
    <blockquote>
      <p>
        Tortoise: What's the time, Achilles?
      </p>
      <p>
        Achilles: Five past ten, my friend. [They chat for a
        minute]
      </p>
      <p>
        Tortoise: What is the time, Achilles?
      </p>
      <p>
        Achilles: Six minutes past ten, Mr. Tortoise.
      </p>
      <p>
        Tortoise: But Achilles, you just told me just a minute ago
        it was <strong>five</strong> minutes past ten. How can I
        ever believe you again?
      </p>
    </blockquote>
    <p>
      Time-varying information is one cause of apparent
      contradiction. People and documents change status. How does
      one base inference on information which may be out of date?
    </p>
    <p>
      One part of this is to put explicit or implicit expiry dates on
      everything. Whenever a server sends a resource to an HTTP
      client, it can give an expiry date. The client can track
      this, and ensure that all deductions from that document are
      cancelled when the date arrives, unless a more recent copy
      can be obtained which says the same thing. In human language
      you might say "It is rainy" but on the semantic web that
      would be exported in a fully qualified way, more like "at Mon
      Jan 24 09:41:06 EST 2000 the measurement gauge 5 at Dublin
      Airport read rain as having fallen in the last hour". (A
      fuzzy system would conclude "Dublin is wet" and a classic
      logic system "at least once it rained at at least one place
      in Dublin"!)
    </p>
    <p>
      I understand [Lehrmann, SW meeting in DC] (sp?) that the KIF
      folks developed a complete vocabulary for time-variance.
    </p>
    <p>
      Another technique is to make any looseness which exists in the
      real system visible. Instead of saying
    </p>
    <blockquote>
      <p>
        Any employee of any member organization of W3C may register
      </p>
    </blockquote>
    <p>
      you say formally to the registration engine
    </p>
    <blockquote>
      <p>
        Any person who was some time in the last 2 months an
        employee of an organization which was some time in the last
        2 months a W3C member may register.
      </p>
    </blockquote>
    <p>
      In other words, if an organization were to drop its
      membership, the system doesn't have to support propagating
      that information instantly.
    </p>
    <p>
      I think there will be time-aware reasoning systems, and
      time-unaware reasoning systems which are fed data with expiry
      dates and whose results are used within the intersection
      period of the validity periods of the incoming data. Indeed,
      time-aware systems may contain nested time-unaware systems,
      and probably vice-versa.
    </p>
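    <p>
      The "intersection period" can be sketched in a few lines of
      Python. The representation of a validity period as a
      (valid from, expires) pair and the function name are
      assumptions made for this illustration.
    </p>

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple

Period = Tuple[datetime, datetime]  # (valid_from, expires)


def result_validity(periods: Iterable[Period]) -> Optional[Period]:
    """Return the period during which every input is valid, or None.

    A result deduced by a time-unaware system can only be used while all
    of the data it was fed remains valid, so the latest start and the
    earliest expiry bound the usable period.
    """
    periods = list(periods)
    if not periods:
        return None
    start = max(valid_from for valid_from, _ in periods)
    end = min(expires for _, expires in periods)
    if start >= end:
        return None  # the inputs are never all valid at the same time
    return (start, end)
```

    <p>
      A time-aware system wrapping a time-unaware one would discard
      the nested system's results once the returned period has passed.
    </p>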
    ]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 01 Dec 1999 00:00:00 GMT</pubDate>
  <title>Semantics and Interpretation (and digital signature)</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Interpretation.html</link>
    <guid>https://www.w3.org/DesignIssues/Interpretation.html</guid>
      <description><![CDATA[
    <h1>
      Interpretation and Semantics on the Semantic Web
    </h1>
    <p>
      We need some philosophy as a basis for the architecture of
      digital signature and the semantic web.
    </p>
    <p>
      The semantic web is a computer system, a distributed machine
      which should function so as to perform socially useful tasks.
      There will be various interfaces between the Semantic Web
      (SW) world and the social world of people, such as the
      physical delivery of goods, and the presentation of a
      document to a person for signature. However, in general with
      these important exceptions the Semantic Web will form a
      self-sufficient loop. The semantics of anything on the SW are
      then defined either in terms of more stuff on the SW, or in
      terms of the connection with these real-world connections. So
      for example I might initially define a check as something
      which when fed into the bank's black box will make it do a
      certain thing. Then within the SW all definitions of dollars
      and transfers can be defined back in terms of the check, and
      a self-sufficient system can be made where, if necessary,
      recourse can be made to sending a check to a bank, but in
      fact we can etrade using ecurrency and einvoices and
      edeliverynotes and so on.
    </p>
    <p>
      This is a similar relationship with reality that coins
      originally had with gold, and bills with coin. (A UK pound
      used to read "I promise to pay the bearer on demand the sum
      of one Pound signed, signed: Bank of England"). From then on
      a pound note became what people thought of as a pound, and
      the notion of what exactly the "sum of one pound" was
      originally defined by becomes irrelevant and the paper money
      is self-sufficient. So we are making a computer system which
      will function as a machine which does a process quite
      equivalent to (though perhaps more crisply defined than) a
      social process such as trade or endorsement.
    </p>
    <p>
      We use the applications which tie the SW to what we currently
      think of as reality for three reasons:
    </p>
    <ol>
      <li>We need an interface between the SW and the current
      social systems; that is how the SW system will work at least
      initially.
      </li>
      <li>The social system machine has legislative backing (and
      public understanding etc.) which we want to exploit;
      </li>
      <li>The social system we have works and we only want to
      change the machine incrementally.
      </li>
    </ol>
    <p>
      Our reason is <em>not</em> that the current definitions are
      fundamental or because their specification is inherently
      beautiful (indeed many existing systems are really crufty).
      Importantly, we do <em>not</em> define the semantics of
      something to the real world in such a way as to break the
      loop, when the loop can be completed in the SW. Here is an
      example of a loop in the semantic web.
    </p>
    <ul>
      <li>a. Web server grants access to resource d in response to
      a request which is signed with key k1.
      </li>
      <li>b. key k1 is listed in a [employee list] document signed
      with k2;
      </li>
      <li>c. Key k2 is listed in a [w3c member] list signed with
      k3;
      </li>
      <li>d. Key k3 is the key which the web server was set up
      to trust
      </li>
    </ul>
    <p>
      This little system can happily run controlling our web site.
      Now in fact we set it up to model the following social system
    </p>
    <ul>
      <li>A. A person P1 is allowed to read the member site
      </li>
      <li>B. The person P1 is an employee of company C2
      </li>
      <li>C. C2 is a member of the consortium according to Hugo;
      </li>
      <li>D. Hugo is deemed responsible when it comes to defining
      member site access.
      </li>
    </ul>
    <p>
      Now, representing the SW loop a-d is very simple. The
      conditions can be written in math and proved. The social loop
      A-D as written is only ever a rough approximation to the very
      complex web of trust, which is often less dependable than the
      simpler SW model.
    </p>
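    <p>
      As an illustration only, the four rules a-d can be sketched in
      a few lines of Python. The sketch assumes that the signatures
      themselves have already been checked by some other component,
      so a signed list is reduced to the key which signed it and the
      keys it lists. The names <code>SignedList</code> and
      <code>grants_access</code>, and the extra key k9, are
      hypothetical and not part of any specification.
    </p>
    <pre>from dataclasses import dataclass


@dataclass(frozen=True)
class SignedList:
    """A list of keys plus the key which signed it (signature assumed verified)."""
    signed_by: str
    keys: frozenset


def grants_access(request_key, employee_lists, member_list, root_key):
    """Rule a: grant access if the request key is tied down to root_key."""
    # Rule d: the member list must be signed with the key the server trusts.
    if member_list.signed_by != root_key:
        return False
    # Rule c: an employee list counts only if its signing key is on the member list.
    # Rule b: the request key must be listed in such an employee list.
    return any(
        doc.signed_by in member_list.keys and request_key in doc.keys
        for doc in employee_lists
    )


member_list = SignedList(signed_by="k3", keys=frozenset({"k2"}))
employee_list = SignedList(signed_by="k2", keys=frozenset({"k1"}))

print(grants_access("k1", [employee_list], member_list, root_key="k3"))  # True
print(grants_access("k9", [employee_list], member_list, root_key="k3"))  # False
</pre>
    <p>
      Nothing in this sketch mentions persons, companies or Hugo:
      the loop closes on keys and lists alone, which is the point
      being made.
    </p>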
    <p>
      Security has always been plagued by people trying to connect
      the SW steps (such as a-d) at every stage to the social
      machine (A-D). For example, this raises the question of
      how to identify the person P1 with key k1, introducing the
      quite unnecessary X.500 directory system, which is really not
      part of the trust loop but becomes a security hole, bringing
      in unnecessary "trusted" third parties. It drags up endless
      questions of what "identity" really is anyway. It would raise
      the question of whether it is Hugo or the webmaster or what
      that is associated with k3. Before we had finished arguing
      about identity we would be into arguments about "belief". We
      would be arguing as to whether Hugo really <em>believes</em>
      that the person is a member of the company - maybe Hugo does
      not have to but in his webmaster role he does! These are rat
      holes. (People don't just believe things, or believe them to a
      certain extent; they trust certain sources for certain
      purposes.) It would be best to use a different term
      ("interpretation"?) for the mapping between the semantic and
      real worlds. (I probably haven't got the philosophical terms
      right at all and I haven't said "model" once.)
    </p>
    <p>
      So what happens, after we have installed our web server
      access protocol based on digital signature, is that we then
      relate things to that. We say that invited experts can
      have keys on a given list. The semantic web becomes the
      definitive machine, and we just have rules at the edges about
      how it relates to things like membership payments. An invited
      expert becomes defined as someone whose key is on a given
      list.
    </p>
    <p>
      What we are looking for from a digital signature spec is the
      relationship between a signature and a string of bits, and
      what we are looking for from a semantic web toolbox is the
      language for writing the conditions a-d. We are NOT looking
      for either to provide an interpretation language for
      relating a-d to A-D, or a legal language for writing the steps
      A-D.
    </p>
    <p>
      Now, the much-asked question, what is the "semantics" of the
      digital signature in a-d above? From the SW point of view,
      those rules are the semantics of the system. The whole thing
      is self-sufficient from the machine's point of view, except
      for the edges where the server has to understand what "to
      give access" means, and where the person has to sign a request
      or a list. The great thing about the semantic web is that we
      can make it all work and never actually answer the questions
      <cite>"invited" in what sense? by whom?</cite> and <cite>Does
      this mean an invitation which has been accepted?</cite> and
      such other rat holes. We must be careful not to confuse what
      is said with where it is stored. There are basically four
      rules which define the access machine. We could store them
      anywhere. They could be sent in an HTTP request, stored on
      any number of different web sites, in Java rings and
      smartcards, sent by email or etched in marble. The SW design
      must not constrain where things are stored.
    </p>
    <p>
      Where do the "semantics of the signature" lie?
    </p>
    <p>
      The semantics in the SW are for me the whole loop a-d, which,
      you see, to be a loop, and therefore to allow any processing,
      must eventually be tied down to the key. When you start to
      argue something on the basis of a signature by a key, the
      only next step can be some knowledge about the key. In the
      semantic web, this is a processing rule about things which
      are signed with that key. However, that does not mean that
      the signature has semantics which are stored as/with/about the
      key. In fact, I do not think it is useful to talk about the
      "semantics of the signature".
    </p>
    <p>
      Documents have meaning. Signatures by themselves do not.
    </p>
    <p>
      So it is not useful to ask what the semantics of a signature
      are. Signatures convey trust, but even that only because of a set
      of statements about keys and documents. There are in society
      many rules about the trust which is conveyed by the signature
      under various circumstances. We should not attempt to model
      those when we make the basic infrastructure of the semantic
      web.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 28 Feb 2000 00:00:00 GMT</pubDate>
  <title>Interpretation properties for units and languages</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/InterpretationProperties.html</link>
    <guid>https://www.w3.org/DesignIssues/InterpretationProperties.html</guid>
      <description><![CDATA[
    <h1>
      <a name="Interpreta" id="Interpreta">Interpretation
      properties</a>
    </h1>
    <p>
      <em>Abstract: Natural languages, encodings, and similar
      relationships between one abstract thing and another, are
      best modeled in RDF as properties. I call these
      Interpretation properties in that they express the
      relationship between one value and that value interpreted (or
      processed in the imagination) in a specific way.</em>
    </p>
    <h2>
      <a name="problem" id="problem">The problem of annotating
      natural language</a>
    </h2>
    <p>
      There has to date (2000/02) been a consistent muddle in the
      RDF community about how to represent the natural language of
      a string. In XML it is simple, because you never have to
      exactly explain what you mean. You can mark up a span of text
      and declare it to be French.
    </p>
    <blockquote>
      <p>
        His name was &lt;html:span
        xml:lang="fr"&gt;Jean-Fran&amp;ccedil;ois&lt;/html:span&gt;
        but we called him Dan.
      </p>
    </blockquote>
    <p>
      Under pressure from the XML community to be standard, the RDF
      spec included this attribute as the official RDF way to
      record that a string was in a given language. This was a
      mistake, as the attribute was thrown into the syntax but not
      into the model which the spec was defining.
    </p>
    <p>
      Consider the <a href="Identity.html#this">example</a> in the
      <a href="Identity.html">identity section</a>,
    </p>
    <pre>&lt;rdf:description&gt;
   &lt;rdf:type&gt;http://www.people.org/types#person&lt;/rdf:type&gt;
   &lt;play:name&gt;Ora Yrjö Uolevi Lassila&lt;/play:name&gt;
   &lt;play:mailbox resource="mailto:ora.lassila@research.nokia.com"/&gt;
   &lt;play:homePage resource="http://www.w3.org/People/Lassila"/&gt;
&lt;/rdf:description&gt;
</pre>
    <p>
      Now that represents five nodes in the RDF graph: the
      anonymous node for Ora himself (who has no web address) and
      the four nodes reached by the arcs specifying that this thing
      is of type person, and has a common name, email address and
      home page as given.
    </p>
    <p>
      Where do we add the language property? Of course we could add
      a language attribute to the XML, but that would be lost on
      translation into the RDF model: no triple would result.
    </p>
    <h3>
      <a name="Attempt2" id="Attempt2">Attempt 1: a property of the
      person?</a>
    </h3>
    <p>
      Many specifications such as iCalendar (see my notes@link)
      would add another property to the definition of the person.
    </p>
    <pre>&lt;rdf:description&gt;
   &lt;rdf:type&gt;http://www.people.org/types#person&lt;/rdf:type&gt;
   &lt;play:name&gt;Ora Yrjö Uolevi Lassila&lt;/play:name&gt;
   &lt;play:namelang&gt;fi&lt;/play:namelang&gt;
   &lt;play:mailbox&gt;ora.lassila@research.nokia.com&lt;/play:mailbox&gt;
   &lt;play:homePage&gt;http://www.w3.org/People/Lassila/&lt;/play:homePage&gt;
&lt;/rdf:description&gt;
</pre>
    <p>
      Here, the property <em>play:namelang</em> is defined to mean
      "A has a name which is in natural language B". In the
      iCalendar spec, the definition is more complex in that the
      <em>lang</em> property is in some cases the language of a
      name and in other cases that of the object's description.
      This is a modeling muddle. The nice thing about doing it this
      way is that the structure is kept flat, and pre-XML systems
      such as RFC822 (email etc) headers have a syntax which can
      only cope with this.
    </p>
    <p>
      There are many drawbacks to this muddle. Ora may have two
      names, one in Finnish and another in English, and the model
      cannot express that. Because the attribute is
      apparently tied to the person and not obviously attached to
      the name, automatic processing of such a thing is ruled out.
      Clearly, the structure does not reflect the facts of the
      case.
    </p>
    <h3>
      <a name="Attempt1" id="Attempt1">Attempt 2: a property of the
      string?</a>
    </h3>
    <p>
      The second attempt is to make a graph which expresses the
      language as a property of the string itself. Clearly, "Ora
      Yrjö Uolevi Lassila" is Finnish, is it not? Yes, Ora is
      Finnish, but that is different. What we need to say is that
      the string is in the Finnish language. The problem, then,
      becomes that RDF does not allow literal text to be the
      subject of a statement. Never mind, RDF in fact invents the
      <em>rdf:value</em> property which allows us to specify that a
      node is really text, but say other things about it too. This
      is done by introducing an intermediate node.
    </p>
    <pre>&lt;rdf:description&gt;
   &lt;rdf:type resource="http://www.people.org/types#person" /&gt;
   &lt;play:name rdf:parseType="Resource"&gt;
       &lt;rdf:value&gt;Ora Yrjö Uolevi Lassila&lt;/rdf:value&gt;
       &lt;play:lang&gt;fi&lt;/play:lang&gt;
   &lt;/play:name&gt;
   &lt;play:mailbox resource="mailto:ora.lassila@research.nokia.com"/&gt;
   &lt;play:homePage resource="http://www.w3.org/People/Lassila"/&gt;
&lt;/rdf:description&gt;
</pre>
    <p>
      There we have it, and in an RDF graph at least very pretty it
      looks. And indeed, we could work with this, apart from the
      fact that we have made another modeling error. It is not true
      that the language is a property of the text string. After
      all, the string "Tim" - is that English (short for "Timothy")
      or French (short for "Timothé")? I don't need to add a
      long list of text strings which can be interpreted as one
      language or as another. A system which made the assertion
      that the string itself was fundamentally English would simply
      not be representing the case.
    </p>
    <h3>
      <a name="Attempt" id="Attempt">Attempt 3: a relationship
      between them.</a>
    </h3>
    <p>
      In fact, the situation is that Ora's name is a natural
      language object, which is the interpretation according to
      Finnish of the string "Ora Yrjö Uolevi Lassila". In
      other words, Finnish, the language, is the relationship between
      Ora's name and the string. In RDF, we model a binary
      relationship with a property.
    </p>
    <pre>&lt;rdf:description&gt;
   &lt;rdf:type&gt;http://www.people.org/types#person&lt;/rdf:type&gt;
   &lt;play:name&gt;
       &lt;lang:fi&gt;Ora Yrjö Uolevi Lassila&lt;/lang:fi&gt;
   &lt;/play:name&gt;
   &lt;play:mailbox&gt;ora.lassila@research.nokia.com&lt;/play:mailbox&gt;
   &lt;play:homePage&gt;http://www.w3.org/People/Lassila/&lt;/play:homePage&gt;
&lt;/rdf:description&gt;
</pre>
    <p>
      This works much better. Ora has a name which is the Finnish
      "Ora". This allows an RDF system to create a node for that
      string, and a "Finnish" link from the concept of Ora the
      person, maybe a Danish link from the concept of the currency,
      an Old English link from the concept of weight (1/15
      pound), not to mention a Latin link from the concept of the
      shore.
    </p>
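    <p>
      A rough sketch of the resulting graph, written as Python
      triples, may make the shape clearer. It is only an
      illustration: the blank nodes are plain Python objects, the
      two name objects for "Tim" are hypothetical, and the helper
      <code>interpretations</code> is invented for this example.
    </p>
    <pre># Each triple is (subject, property, object).
ora = object()          # the person, who has no URI of his own
ora_name = object()     # the natural language object which is Ora's name
english_tim = object()  # hypothetical name object: "Tim" read as English
french_tim = object()   # hypothetical name object: "Tim" read as French

graph = [
    (ora, "play:name", ora_name),
    # lang:fi relates the name to the string which, read as Finnish, yields it.
    (ora_name, "lang:fi", "Ora Yrjö Uolevi Lassila"),
    # The same string may appear under two different interpretations.
    (english_tim, "lang:en", "Tim"),
    (french_tim, "lang:fr", "Tim"),
]


def interpretations(graph, text):
    """Return the lang: properties under which text is read in the graph."""
    return sorted(
        p for (s, p, o) in graph if o == text and p.startswith("lang:")
    )


print(interpretations(graph, "Tim"))  # ['lang:en', 'lang:fr']
</pre>
    <p>
      The string "Tim" carries no language of its own here; each
      language is an arc into the string from a different name
      object, so the two readings do not contradict one another.
    </p>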
    <p>
      A problem we may feel is we would like the language to be a
      string, so that we can reference the ISO spec for all such
      things, but there is of course no reason why the spec for the
      lang: space should not reference the same spec.
    </p>
    <p>
      Another problem we might feel is that it is reasonable for
      the play:name to expect a string, and in most cases it may
      get a string: what is the poor system supposed to do in order
      to accommodate finding a natural language object in place of
      a string? I guess making a class which includes all strings
      and all natural language objects is the best way to go. Any
      use of string which did not also allow such a natural language
      object makes life much more difficult for multilingual
      software, so this is a serious problem.
    </p>
    <p>
      <em>[[This leads us on to another interesting question of
      packaging in RDF. There is a requirement in XML packaging and
      in email packaging and it seems quite similarly in RDF that
      when you ask me for something of type X I must be able to
      give you something of type package which happens to include
      the X you asked for and also some information for your
      edification. But that is another story.@@@ elaborate and
      define properties or syntax@@@]]</em>
    </p>
    <p>
      What is really important is that we are using the ability of
      RDF to talk about abstract things, just as when we identified
      people by the resources they were associated with, but
      avoided pretending that any person had a definitive URI.
    </p>
    <h2 id="Interpreta1">
      Datatypes as interpretation properties<sup><a href="#L380" name="L382" id="L382">*</a></sup>
    </h2>
    <p>
      By <em>datatypes</em> here I mean the atomic
      types in a programming language, or for example XML Datatypes
      (XML Schema part 2). Defining datatypes involves defining
      constraints on an input string (for example specifying what a
      valid date is as a regular expression) and specifying the
      mathematical abstract individuals which instances of a type
      represent. One can model the relationship between the
      abstract value and its string representation using a
      property.
    </p>
    <table border="0" width="100%">
      <tbody>
        <tr>
          <td valign="middle">
            <pre>&lt;rdf:Description about="#myshoe"&gt;
   &lt;shoe:size&gt;10&lt;/shoe:size&gt;
&lt;/rdf:Description&gt;
</pre>
          </td>
          <td valign="middle">
            <span class="N3">&lt;#myshoe&gt; shoe:size "10".</span>
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      This doesn't tell us what it is 10 of. We could go through
      life without any model of types: we could define a shoe size
      as being a decimal string for a number of inches. There are
      many questions and tradeoffs which datatype designers face,
      for example:
    </p>
    <ul>
      <li>Can you tell the type of a value from the string
      representation in every case? (eg 1.4e4 vs 1.4d4 for
      precision)
      </li>
      <li>Are the values of different datatypes distinct? (Eg, is 1
      = 1.0?)
      </li>
      <li>Is the set of datatypes extensible? (Eg, can you add
      complex numbers or prime numbers?)
      </li>
      <li>Does representation equality imply value equality?
      </li>
      <li>Does value equality imply representation equality? (Is
      the only allowed representation the canonical one?)
      </li>
    </ul>
    <p>
      It would be nice to be able to model these questions in
      general in the semantic web, in order to describe the
      properties of data in arbitrary systems. We can introduce
      interpretation properties which link a string to its decimal
      interpretation as a number, or a length including units. The
      problem is that the RDF graph which most folks use is the one
      above. The object of shoe:size is "10".
    </p>
    <p>
      The simplistic system, corresponding exactly to <a href="#Attempt2">Attempt 1 above</a>, is to declare that shoe:size
      is of class integer. This implies (we then say) that any
      value is a decimal string. Given the string and the type we
      can conclude the abstract value, the integer ten. This works.
      It is the system used by XML datatypes, whose answers for the
      questions above are, as I understand it, [No, Yes, Yes, Yes,
      No]. A snag is that you can't compare two values unless you
      know the datatypes.
    </p>
    <p>
      To model the representation explicitly in the RDF it seems
      you have to introduce another node and arc, which is a pain.
    </p>
    <table border="0" width="100%">
      <tbody>
        <tr>
          <td valign="middle">
            <pre>&lt;rdf:Description about="#myshoe"&gt;
   &lt;shoe:size&gt;
      &lt;rdf:value&gt;10&lt;/rdf:value&gt;
   &lt;/shoe:size&gt;
&lt;/rdf:Description&gt;
</pre>
          </td>
          <td valign="middle">
            <span class="N3">&lt;#myshoe&gt; shoe:size [ rdf:value
            "10" ].</span>
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      We can then define rdf:value to express that there is some
      datatype relation which relates the size of the shoe to "10".
      All datatype relations are subProperties of rdf:value with
      this system. Once it is in that form, the datatype information
      can be added to the graph. You have the choice of asserting
      that the object is of a given class, and deducing that the
      datatype relation must be a certain one. You can nest
      interpretation properties - interpreting a string as a
      decimal and then as a length in feet. But this is not
      possible without that extra node. One wonders about radically
      changing the way all RDF is parsed into triples, so as to
      introduce the extra abstract node for every literal --
      frightful. One wonders about declaring "10" to be a generic
      resource, an abstraction associated with the set of all
      things for which "10" is a representation under some datatype
      relation. This is frightful too: you don't have "equals" any
      more in the sense you used to have it.
    </p>
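    <p>
      The nesting of interpretation properties can be sketched in
      Python as a chain of functions, each taking the result of the
      previous one. This is only an analogy under stated
      assumptions: <code>as_decimal</code>, <code>as_feet</code>
      and <code>interpret</code> are hypothetical names, and a
      length is represented here simply as a number of metres.
    </p>
    <pre>from decimal import Decimal

# Exact by definition of the international foot.
METRES_PER_FOOT = Decimal("0.3048")


def as_decimal(text):
    """Interpretation property: a string to the number it denotes in decimal."""
    return Decimal(text)


def as_feet(number):
    """Interpretation property: a number to a length, held here in metres."""
    return number * METRES_PER_FOOT


def interpret(text, *properties):
    """Apply interpretation properties in order, innermost first."""
    value = text
    for prop in properties:
        value = prop(value)
    return value


print(interpret("10", as_decimal, as_feet))    # 3.0480
print(as_decimal("10") == as_decimal("10.0"))  # True: two strings, one value
print("10" == "10.0")                          # False: the strings differ
</pre>
    <p>
      The last two lines show why comparing representations is not
      the same as comparing values: equality only becomes
      meaningful once the interpretation has been applied.
    </p>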
    <p>
      Instead of adding an extra arc in series with the original,
      we can leave all Properties such as shoe:size as being rather
      vague relations between the shoe and some string
      representation, and then use a functional property (say
      <code>rdf:actual</code>) to relate shoe:size to a (more
      useful) property whose object is a typed abstract value.
    </p>
    <pre>{ &lt;#myshoe&gt; shoe:size "10" } log:implies
{ &lt;#myshoe&gt; [is rdf:actual of shoe:size] [rdf:value "10"] } .
</pre>
    <p>
      <em>@@@ No clear way forward for describing datatypes in
      RDF/DAML (2001/1) @@</em>
    </p>
    <h2>
      <a name="More" id="More">More examples</a>
    </h2>
    <p>
      Interpretation properties is the name I have arbitrarily
      chosen for this sort of use. I am not sure whether it is a
      good name, but I want to encourage their use. Base 64
      encoding is another example. It comes up everywhere; XML
      Digital Signature is one place.
    </p>
    <pre>&lt;rdf:description&gt;
   &lt;play:name parseType="Resource"&gt;
      &lt;lang:fi  parseType="Resource"&gt;
        &lt;enc:base64&gt;jksdfhher78f8e47fy87eysady87f7sea&lt;/enc:base64&gt;
      &lt;/lang:fi&gt;
    &lt;/play:name&gt;
&lt;/rdf:description&gt;
</pre>
    <p>
      Another example is type coercion. Suppose there is a need to
      take something of type datetime and use it as a date:
    </p>
    <pre>&lt;rdf:description&gt;
   &lt;play:event parseType="Resource"&gt;
       &lt;play:start parseType="Resource"&gt;
          &lt;play:date&gt;2000-01-31 12:00ET&lt;/play:date&gt;
       &lt;/play:start&gt;
       &lt;play:summary&gt;The Bryn Poeth Uchaf Folk festival&lt;/play:summary&gt;
   &lt;/play:event&gt;
&lt;/rdf:description&gt;
</pre>
    <p>
      Such properties often have uniqueness and/or unambiguity
      properties. <em>enc:base64</em> for example is clearly a
      reversible transformation. It relates two strings, one
      printable and the other a byte string with no other
      constraints. The byte string could not in general be
      represented in an XML document. The definition of
      <em>enc:base64</em> is that A when encoded in base 64 yields
      B. This allows any processor, given B, to derive A. The
      specification of the encoding namespace (here referred to by
      prefix <em>enc:</em>) could be that any conforming processor
      must be able to accept a base64 encoding of a string in any
      place that a string is acceptable.
    </p>
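    <p>
      The reversibility is easy to check with the Python standard
      library. In this small sketch the function names
      <code>enc_base64</code> and <code>enc_base64_inverse</code>
      and the sample bytes are made up for illustration; only the
      <code>base64</code> module calls are real.
    </p>
    <pre>import base64


def enc_base64(raw):
    """Given the byte string A, return its printable base64 form B."""
    return base64.b64encode(raw).decode("ascii")


def enc_base64_inverse(printable):
    """Given the printable string B, recover the byte string A."""
    return base64.b64decode(printable, validate=True)


a = bytes([0, 255, 16, 128])  # arbitrary bytes, not usable as XML text
b = enc_base64(a)

print(b)                           # AP8QgA==
print(enc_base64_inverse(b) == a)  # True: B determines A
</pre>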
    <p>
      Interpretation properties make it clear what is going on. For
      example,
    </p>
    <pre>&lt;rdf:description about="http://www.w3.org/"&gt;
   &lt;play:xml-canonicalized parseType="Resource"&gt;
      &lt;enc:hash-sha-1 parseType="Resource"&gt;
         &lt;enc:base64&gt;jd8734djr08347jyd4&lt;/enc:base64&gt;
      &lt;/enc:hash-sha-1&gt;
   &lt;/play:xml-canonicalized&gt;
&lt;/rdf:description&gt;
</pre>
    <p>
      clearly makes a statement, using properties quite
      independently defined for the various processes, that the
      base64 encoding of the SHA-1 hash of the canonicalized form
      of the W3C home page is jd8734djr08347jyd4. Compare this
      with the HTTP situation in which the headers cannot be
      nested, and the encodings and compression and other things
      applied to the body are mentioned as unordered annotations,
      and the spec has to provide a way of making the right
      conclusion about which happened in what order.
    </p>
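    <p>
      The same nesting can be mirrored as function composition,
      where the order of the steps is fixed by the structure
      itself. The sketch below is illustrative only. In particular,
      <code>canonicalize</code> is a stand-in which merely
      collapses whitespace; it is <em>not</em> XML
      canonicalization, and all the function names are
      hypothetical.
    </p>
    <pre>import base64
import hashlib


def canonicalize(document):
    """Stand-in canonicalizer: collapse whitespace (NOT real XML c14n)."""
    return " ".join(document.split()).encode("utf-8")


def hash_sha1(data):
    """Return the 20-byte SHA-1 digest of data."""
    return hashlib.sha1(data).digest()


def enc_base64(data):
    """Return the printable base64 form of a byte string."""
    return base64.b64encode(data).decode("ascii")


def describe(document):
    # The nesting fixes the order: canonicalize, then hash, then encode.
    return enc_base64(hash_sha1(canonicalize(document)))


# Two texts with the same canonical form yield the same final string.
print(describe("a   b") == describe("a b"))  # True
</pre>
    <p>
      With unordered annotations, by contrast, a reader of the
      result would need a separate rule to know whether the hash
      was taken before or after the encoding.
    </p>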
    <h2>
      Units of Measure (2006)
    </h2>
    <p>
      This pattern applies very well to units of measure.
    </p>
    <p>
      See, for example, a simple ontology <a href="http://www.w3.org/2007/ont/unit">http://www.w3.org/2007/ont/unit</a>
      of units of measure.
    </p>
    <h2>
      <a name="Conclusion" id="Conclusion">Conclusion</a>
    </h2>
    <p>
      Representing the interpretation of one string as an abstract
      thing can be done easily with RDF properties. This helps make
      a clean accurate model. However, using the concept for
      datatypes in RDF is incompatible with RDF as we know it
      today.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 01 Apr 1997 00:00:00 GMT</pubDate>
  <title>Links and Laws - what does a hypertext link imply?</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/LinkLaw.html</link>
    <guid>https://www.w3.org/DesignIssues/LinkLaw.html</guid>
      <description><![CDATA[
    <h1>
      Links and Law
    </h1>
    <h3>
      <i>Preface</i>
    </h3>
    <p>
      This personal note I have put into the set of web
      architectural notes as it expresses fundamental
      understandings upon which the practical use and power of the
      web rest.
    </p>
    <p>
      The questions addressed are about the relationship of the
      hypertext forms of <i>linked</i> and <i>embedded</i> material
      to the social concepts involved such as attribution,
      endorsement, and ownership of information.
    </p>
    <p>
      Links in hypertext are new in that they can be followed
      automatically, but the concepts of reference and inclusion of
      material predate paper. There should not therefore be much
      confusion about what links imply, but as there have been some
      strange suggestions recently which would seriously damage the
      web, I write this note.
    </p>
    <h3>
      <a name="Abstract" id="Abstract">Abstract</a>
    </h3>
    <p>
      Normal hypertext links do not of themselves imply that the
      document linked to is part of, is endorsed by, endorses, or
      has ownership or distribution terms related to those of the
      document linked from. However, embedding material by
      reference (sometimes called an embedding form of hypertext
      link) causes the embedded material to become a part of the
      embedding document.
    </p>
    <h2>
      <a name="sorts" id="sorts">Two sorts of link</a>
    </h2>
    <p>
      Basic HTML has three ways of linking to other material on the
      web: the hypertext link from an anchor (HTML "A" element),
      the general link with no specific source anchor within the
      document (HTML "LINK" element) and embedded objects and
      images (IMG and OBJECT). Let's call A and LINK
      "<b>normal</b>" links as they are visible to the user as a
      traversal between two documents. We'll call the thing between
      a document and an embedded image or object or subdocument
      "<b>embedding</b>" links.
    </p>
    <p>
      This distinction is an old one in hypertext. Some systems
      such as Peter Brown's original "Guide" worked only by expanding
      links inline, and some (such as HTML before the IMG tag was
      introduced) worked only with normal links.
    </p>
    <h2>
      <a name="Normal" id="Normal">Normal Links</a>
    </h2>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            <b>The intention in the design of the web was that
            normal links should simply be references, with no
            implied meaning.</b>
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      A normal hypertext link does NOT necessarily imply that
    </p>
    <ul>
      <li>One document endorses the other; or that
      </li>
      <li>One document is created by the same person as the other,
      or that
      </li>
      <li>One document is to be considered part of another.
      </li>
    </ul>
    <p>
      Typically when the user of a graphical window-oriented Web
      browser follows a normal link, a new window is created and
      the linked document is displayed in it, or the old document
      is deleted from its window and the linked document displayed
      in its place. The window system has a user interface metaphor
      that things in different windows are different objects.
    </p>
    <h3>
      Meaning in content
    </h3>
    <p>
      So the existence of the link itself does not carry meaning.
      Of course the contents of the linking document can carry
      meaning, and often do. So, if one writes "See Fred's web
      pages (link) which are way cool" that is clearly some kind of
      endorsement. If one writes "We go into this in more detail on
      our sales brochure (link)" there is an implication of common
      authorship. If one writes "Fred's message (link) was written
      out of malice and is a downright lie" one is denigrating
      (possibly libellously) the linked document. So the content of
      hypertext documents often carries meaning about the linked
      document, and one should be responsible about this. In fact,
      clarifying the relative status of the linked document is
      often helpful to the reader.
    </p>
    <h2>
      <a name="Embedded" id="Embedded">Embedded Material</a>
    </h2>
    <p>
      The relationship between a document and an image embedded in
      that document is quite different from a normal link. (In some
      designs it is still referred to as a sort of link.)
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            <b>Images, embedded objects, and background sounds and
            images are by default to be considered part of the
            document.</b>
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      If I say, "To understand this you only have to read this
      article", or "This is the agreement between us", I am talking
      about a particular document. It is important that we have a
      clear picture of what is part of that document and what
      isn't. Embedded images clearly are part of the embedding
      document. The author of a document has responsibility for the
      content, even if the images he or she includes are from
      another web site.
    </p>
    <p>
      (There are issues of expectations to be set about
      availability and security from corruption of remote material,
      but I do not address these here. Here I just emphasize
      that embedded images should be considered part of a document,
      but documents connected by a normal link should be regarded
      as separate documents.)
    </p>
    <p>
      We compose documents out of parts, and the finished work
      comprises contributions from the parts and also from the
      arrangement. It is very important that we can include remote
      parts by reference without having to make a separate local
      copy. When an embedded image (or sound) is included by
      reference to its original address (URI) this allows an
      inquirer to know that address, and hence know the current
      version of the image. It allows the owner of the image, to
      a certain extent, to know and possibly to control who has
      access to that image. Also I expect that in the future it
      will allow one to find out the owner and licence terms for
      distribution of that image, which is important for
      intellectual property rights to be respected on the Web.
    </p>
    <h4>
      Explicit distinction
    </h4>
    <p>
      Advertising provides an exception to this rule: a case in
      which the embedded image is <b>not</b> part of the document.
      At risk of making it too easy for users to turn
      off advertising, it would be ideal if the distinction
      were made in the markup between embedded information which is
      or is not part of the document. This would allow, for
      example, a border to be placed around an advertisement to
      allow the user to realize that it does not come from the same
      source as the text. I personally feel that this would
      be an important step forward in the integrity of the
      web. A flag like
    </p>
    <pre>&lt;IMG src="banner-ad.gif" foreign&gt;
 
</pre>
    <p>
      would be fine.
    </p>
    <h2>
      <a name="User" id="User">User Interface</a>
    </h2>
    <p>
      When Web documents are presented to people, most current
      browsers (1997) make a clear distinction between embedded
      images, which are presented in the same window as the
      embedding document at the same time, and linked documents
      which never are. The window system's concept of a "Window" is
      used to convey when things are part of the same document. It
      is important for many reasons, some of which were mentioned
      above, that user interfaces continue to make this
      distinction.
    </p>
    <h4>
      Frames
    </h4>
    <p>
    </p>
    <p>
      <i>Next: Some dangerous <a href="LinkMyths.html"><b>Myths</b>
      about Links</a></i>
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 01 Apr 1997 00:00:00 GMT</pubDate>
  <title>Myths about Links</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/LinkMyths.html</link>
    <guid>https://www.w3.org/DesignIssues/LinkMyths.html</guid>
      <description><![CDATA[
    <h1>
      Links and Law: Myths
    </h1>
    <p>
      See <a href="LinkLaw">Links and Law</a> before reading this.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 27 Jul 2006 00:00:00 GMT</pubDate>
  <title>Linked Data</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/LinkedData.html</link>
    <guid>https://www.w3.org/DesignIssues/LinkedData.html</guid>
      <description><![CDATA[
   <a href="http://www.cafepress.com/w3c_shop"><img alt="Get a 5* mug" border="none" src="diagrams/lod/597992118v2_350x350_Back.jpg" align="right"></a>

  <h1>Linked Data</h1>

  <p>The Semantic Web isn't just about putting data on the web. It
  is about making links, so that a person or machine can explore
  the web of data. &nbsp;With linked data, when you have some of
  it, you can find other, related, data.</p>

  <p>Like the web of hypertext, the web of data is constructed with
  documents on the web. However, unlike the web of hypertext, where
  links are relationships between anchors in hypertext documents
  written in <small>HTML</small>, for data the links are between
  arbitrary things described by <small>RDF</small>. The
  <small>URI</small>s identify any kind of object or concept. But
  for <small>HTML</small> or <small>RDF</small>, the same
  expectations apply to make the web grow:</p>

  <ol>
    <li>
      <p>Use <small>URI</small>s as names for things</p>
    </li>

    <li>
      <p>Use <small>HTTP</small> <small>URI</small>s so that people
      can look up those names.</p>
    </li>

    <li>
      <p>When someone looks up a <small>URI</small>, provide useful
      information, using the standards (RDF*, SPARQL)</p>
    </li>

    <li>
      <p>Include links to other <small>URI</small>s, so that they
      can discover more things.</p>
    </li>
  </ol>

  <p>Simple. &nbsp;In fact, though, a surprising amount of data
  isn't linked in 2006, because of problems with one or more of the
  steps. &nbsp;This article discusses solutions to these problems,
  details of implementation, and factors affecting choices about
  how you publish your data.</p>

  <h2>The four rules</h2>

  <p>I'll refer to the steps above as rules, but they are
  expectations of behavior. Breaking them does not destroy
  anything, but misses an opportunity to make data interconnected.
  This in turn limits the unexpected ways in which it can later be
  reused. It is the unexpected re-use of information which is the
  value added by the web.</p>

  <p>The first rule, to identify things with
  <small>URI</small>s,&nbsp; is pretty much understood by most
  people doing semantic web technology. &nbsp;If it doesn't use the
  universal <small>URI</small> set of symbols, we don't call it
  Semantic Web.<br>
  <br>
  The second rule, to use <small>HTTP</small>
  <small>URI</small>s,&nbsp; is also widely understood. &nbsp;The
  only deviation has been, since the web started,&nbsp; a constant
  tendency for people to invent new <small>URI</small> schemes (and
  sub-schemes within the <span style="font-family: monospace;">urn:</span> scheme)&nbsp; such as
  <small>LSID</small>s and handles and <small>XRI</small>s and
  <small>DOI</small>s and so on, for various reasons.
  &nbsp;Typically, these involve not wanting&nbsp;to commit to the
  established Domain Name System (<small>DNS</small>) for
  delegation of authority but to construct something under separate
  control. &nbsp; Sometimes it has to do with not understanding
  that <small>HTTP</small> <small>URI</small>s are names (not
  addresses) and that <small>HTTP</small> name lookup is a complex,
  powerful and evolving set of standards. This issue is discussed at
  length elsewhere, and time does not allow us to delve into it
  here. [@@ref TAG finding, etc]</p>

  <p>The third rule, that one should serve information on the web
  against a <small>URI</small>, is, in 2006, well followed for most
  ontologies, but, for some reason, not for some major datasets.
  &nbsp;One can, &nbsp;in general,&nbsp; look up the properties and
  classes one finds in data, and get information from the
  <small>RDF</small>, <small>RDFS</small>, and <small>OWL</small>
  ontologies including the relationships between the terms in the
  ontology.</p>

  <p>The basic format here is RDF/XML, with its popular
  alternative serialization N3 (or Turtle). Large datasets provide
  a SPARQL query service, but the basic linked data should be
  provided as well.</p>

  <p>Many research and evaluation projects in the first few years
  of Semantic Web technologies produced ontologies, and significant
  data stores, but the data, if available at all, is buried in a
  zip archive somewhere, rather than being accessible on the web as
  linked data. The Biopax project and the CSAktive data on
  computer science research people and projects were two examples.
  [The CSAktive data is now (2007) available as linked data]</p>

  <p>There is also a large and increasing number of
  <small>URI</small>s for non-ontology data which can be looked up.
  &nbsp;<a href="http://ontoworld.org/wiki/Semantic_wiki">Semantic
  wikis</a> are one example. The "Friend of a friend"
  (<small>FOAF</small>) and&nbsp;<span style="font-style: italic;">Description of a Project</span>
  (<small>DOAP</small>) ontologies are used to build social
  networks across the web.&nbsp; &nbsp; Typical <a href="http://en.wikipedia.org/wiki/List_of_social_networking_websites">
  social network portals</a> do not provide links to other sites,
  nor expose their data in a standard form.</p>

  <p>LiveJournal and Opera Community are two portal web sites which
  do in fact publish their data in <small>RDF</small> on the web.
  (Plaxo has a trial scheme, and I'm not sure whether they support
  <span style="font-style: italic;">knows</span> links.) This means that I can
  write in my <small>FOAF</small> file that I know Håkon Lie by
  using his <small>URI</small> in the Opera Community data, and a
  person or machine browsing that data can then follow that link
  and find all his friends. <i>[Update:]</i> Also, the Opera
  Community site allows you to register the RDF URI for yourself on
  another site. This means that public data about you from
  different sites can be linked together into one web, and a person
  or machine starting with your Opera identity can find the others.
  <!--
      &nbsp;
      Well, all of his friends? &nbsp;Not really: &nbsp;only his
      friends who are in the Opera Community. &nbsp;The system
      doesn't yet him store the <small>URI</small>s of people on
      different systems. So while the social network is open to
      incoming links, and while it is internally browseable, it
      doesn't make outgoing links.
      --></p>

  <p>The fourth rule, to make links elsewhere, is necessary
  to connect the data we have into a web, a serious, unbounded web
  in which one can find all kinds of things, just as on the
  hypertext web we have managed to build.</p>

  <p>In hypertext web sites it is considered generally rather bad
  etiquette not to link to related external material. &nbsp;The
  value of your own information is very much a function of what it
  links to, as well as the inherent value of the information within
  the web page. &nbsp;So it is also in the Semantic Web.</p>

  <p>So let's look at the ways of linking data, starting with the
  simplest way of making a link.</p>

  <h3>Basic web look-up</h3>

  <p>The simplest way to make linked data is to use, in one file, a
  <small>URI</small> which points into another.</p>

  <p>When you write an <small>RDF</small> file, &nbsp; say
  &lt;http://example.org/smith&gt;, then you can use local
  identifiers within the file, say &nbsp;#albert, #brian and
  #carol. &nbsp;In N3 you might say</p>
  <pre>&lt;#albert&gt;  fam:child &lt;#brian&gt;, &lt;#carol&gt;.
</pre>

  <p>or in <small>RDF/XML</small></p>
  <pre>&lt;rdf:Description about="#albert"<br> &lt;fam:child rdf:Resource="#brian"&gt;<br>  &lt;fam:child rdf:Resource="#carol"&gt;<br>&lt;/rdf:Description&gt;
</pre>

  <p>The <small>WWW</small> architecture now gives a global
  identifier &nbsp;"http://example.org/smith#albert" to Albert.
  &nbsp;This is a valuable thing to do, as anyone on the planet can
  now use that global identifier to refer to Albert and give more
  information.&nbsp;</p>

  <p>For example, in the
  document&nbsp;&lt;http://example.org/jones&gt; someone might
  write:</p>
  <pre>&lt;#denise&gt;  fam:child &lt;#edwin&gt;, &lt;smith#carol&gt;.
</pre>

  <p>or in <small>RDF/XML</small></p>
  <pre>&lt;rdf:Description about="#denise"<br> &lt;fam:child rdf:Resource="#edwin"&gt;<br>  &lt;fam:child rdf:Resource="http://example.org/smith#carol"&gt;<br>&lt;/rdf:Description&gt;
</pre>

  <p>Clearly it is reasonable for anyone who comes across the
  identifier "http://example.org/smith#carol" to:</p>

  <ol>
    <li>Form the <small>URI</small> of the document by truncating
    before the hash</li>

    <li>Access the document to obtain information about #carol</li>
  </ol>

  <p>We call this dereferencing the <small>URI</small>. &nbsp;This
  is basic semantic web.&nbsp;</p>
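  <p>The two steps above can be sketched in a few lines of Python.
  This is only an illustration: the function name
  <span style="font-family: monospace;">split_hash_uri</span> is made
  up for this example, and the actual fetching and parsing of the
  document is left out.</p>

```python
from urllib.parse import urldefrag


def split_hash_uri(uri):
    """Split a hash URI into (document URI, local identifier)."""
    # urldefrag cuts the URI at the '#' and returns both halves.
    document_uri, fragment = urldefrag(uri)
    return document_uri, fragment


doc, local = split_hash_uri("http://example.org/smith#carol")
print(doc)    # the document to fetch: http://example.org/smith
print(local)  # the identifier to look for inside it: carol
```

  <p>A client would then do an ordinary HTTP GET on the document URI
  and look in the returned data for statements about the full
  identifier.</p>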

  <p>There are several variations.</p>

  <h3>Variation: URIs without Hashes and HTTP 303</h3>

  <p>There are some circumstances in which dividing identifiers
  into documents doesn't work very well. There may logically
  be one global symbol per document, and there is a
  reluctance to include a # in the <small>URI</small> such
  as&nbsp;</p>

  <p>
  http://wordnet.example.net/antidisestablishmentarianism#word</p>

  <p>Historically, the early Dublin Core and <small>FOAF</small>
  vocabularies did not have # in their URIs. In any event, when
  <small>HTTP</small> <small>URI</small>s without hashes are used
  for abstract concepts, and there is a document that carries
  information about them, then:</p>

  <ol>
    <li>An <small>HTTP</small> <small>GET</small> request on
    the <small>URI</small> of the concept returns <span style="font-family: monospace;">303 See Other</span> and gives, in the
    Location: header, the <small>URI</small> of the
    document.</li>

    <li>The document is retrieved as normal</li>
  </ol>

  <p>This method has the advantage that <small>URI</small>s can be
  made up of all forms. It has the disadvantage that an
  <small>HTTP</small> request must be made for every single one. In
  the case of Dublin Core, for example, dc:title and dc:creator etc
  are in fact served by the same ontology document, but one does
  not know that until they have each been fetched and returned
  HTTP redirections.</p>
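  <p>As a rough sketch of the client side of this exchange, the
  Python function below decides which document to read from a single
  HTTP response. It assumes the status code and headers come from a
  GET on the concept URI made with automatic redirect-following
  switched off; the function name and the example.org URIs are
  invented for the illustration, and no network access is made.</p>

```python
from urllib.parse import urljoin


def document_for(concept_uri, status, headers):
    """Pick the URI of the describing document from one HTTP response."""
    if status == 303:
        # Location may be relative, so resolve it against the request URI.
        return urljoin(concept_uri, headers["Location"])
    if status == 200:
        # The URI named a document directly.
        return concept_uri
    raise ValueError("unexpected status %d" % status)


# Two hypothetical terms which redirect to the same ontology document.
title_doc = document_for("http://vocab.example.org/dc/title", 303,
                         {"Location": "/dc/terms.rdf"})
creator_doc = document_for("http://vocab.example.org/dc/creator", 303,
                           {"Location": "/dc/terms.rdf"})
print(title_doc == creator_doc)
```

  <p>The example shows the cost described above: the client only
  learns that both terms live in one document after asking about each
  of them separately.</p>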

  <h3>Variation: FOAF and rdfs:seeAlso</h3>

  <p>The <a href="http://foaf-project.org/">Friend-Of-A-Friend</a>&nbsp;convention
  uses a form of data link, but&nbsp; not using either of the two
  forms mentioned above. &nbsp;To refer to another person in a
  <small>FOAF</small> file, the convention was to give two
  properties, one pointing to the document they are described in,
  and the other for identifying them within that document.</p>
  <pre>&lt;#i&gt;  foaf:knows  [<br>       foaf:mbox &lt;mailto:joe@example.com&gt;;<br>        rdfs:seeAlso &lt;http://example.com/foaf/joe&gt; ].
</pre>

  <p>Read, "I know that which has email joe@example.com and
  about which more information is in
  &lt;http://example.com/foaf/joe&gt;".</p>

  <p>In fact, for privacy, people often don't put their email
  addresses on the web directly, but instead give a one-way hash
  (<small>SHA-1</small>) of their email address. This clever trick
  allows people who know the email address already to work out that
  it is the same person, without giving the email away to
  others.</p>
  <pre>&lt;#i&gt;  foaf:knows  [
        foaf:mbox_sha1sum "2738167846123764823647";  # @@ dummy
        rdfs:seeAlso &lt;http://example.com/foaf/joe&gt; ].
</pre>
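  <p>A real value for the dummy string above can be computed as
  follows. This sketch assumes, following the FOAF vocabulary, that
  the hash is taken over the whole mailto: URI rather than the bare
  address; the function name is made up for the example.</p>

```python
import hashlib


def mbox_sha1sum(email):
    """Hex SHA-1 of the mailto: URI for an email address."""
    mailto_uri = "mailto:" + email
    return hashlib.sha1(mailto_uri.encode("utf-8")).hexdigest()


# Anyone who already knows the address can recompute and compare.
print(mbox_sha1sum("joe@example.com"))
```

  <p>Two files which carry the same hash can be taken to describe the
  same mailbox owner, which is how a client merges data about a
  person who has no <small>URI</small>.</p>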

  <p>This linking system was very successful, forming a
  &nbsp;growing social network, and dominating, in 2006, the linked
  data available on the web.</p>

  <p>However, the system has the snag that it does not give
  <small>URI</small>s to people, and so basic links to them cannot
  be made.</p>

  <p>I recommend (e.g. in weblogs on <a href="http://dig.csail.mit.edu/breadcrumbs/node/62">Links on the
  Semantic Web</a>, <a href="http://dig.csail.mit.edu/breadcrumbs/node/71">Give yourself a
  URI</a>, and <a href="http://dig.csail.mit.edu/breadcrumbs/node/72">Backward and
  Forward links in RDF just as important</a>) that those making a
  <small>FOAF</small> file give themselves a <small>URI</small> as
  well as using the <small>FOAF</small> convention.&nbsp; &nbsp;
  &nbsp;Similarly, when you refer to a <small>FOAF</small>
  &nbsp;file which gives &nbsp;a <small>URI</small> to a person,
  use it in your reference to that person, so that clients which
  just use <small>URI</small>s and don't know about the
  <small>FOAF</small> convention can follow the link.</p>

  <h2><a id="browsable" name="browsable">Browsable
  graphs</a></h2>

  <p>Now that we have looked at ways of making a link, let's look
  at the choices of when to make a link.</p>

  <p>One important pattern is a set of data which you can explore
  as you go link by link by fetching data. &nbsp; Whenever one
  looks up the URI for a node in the RDF graph, the server returns
  information about the arcs out of that node, and the arcs in.
  &nbsp;In other words, it returns any RDF statements in which the
  term appears as either subject or object.</p>

  <p>Formally, call a graph G <span style="font-style: italic;">browsable</span> if, when I look up the URI of
  any node in G, I am returned information which describes the
  node, where describing a node means:</p>

  <ol>
    <li>Returning all statements where the node is a subject or
    object; and</li>

    <li>Describing all blank nodes attached to the node by one
    arc.</li>
  </ol><br>

  <p class="detail">(The subgraph returned has been referred to as
  a "Minimum Spanning Graph" (MSG) [@@ref] or an RDF molecule
  [@@ref], depending on whether nodes are considered identified if
  they can be expressed as a path of functional, or reverse inverse
  functional, properties. A concise bounded description, which only
  follows links from subject to object, does not work.)</p>
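  <p>The definition of describing a node can be written down as a
  small Python function. This is a sketch under simplifying
  assumptions which are not part of the definition: the graph is a
  plain set of (subject, predicate, object) tuples of strings, and
  blank nodes are recognised by a "_:" prefix. The name
  <span style="font-family: monospace;">describe</span> and the
  sample data are invented for the example.</p>

```python
def describe(graph, node):
    """Return the statements a browsable server would send for node."""
    def is_blank(term):
        return term.startswith("_:")

    result = set()
    seen = set()
    todo = [node]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        for triple in graph:
            s, p, o = triple
            if current == s or current == o:
                # Arcs out of the node and arcs in are both returned.
                result.add(triple)
                # Blank nodes have no URI to look up, so describe them too.
                for other in (s, o):
                    if is_blank(other):
                        todo.append(other)
    return result


graph = {
    ("ex:timbl", "foaf:member", "ex:dig"),
    ("ex:timbl", "foaf:knows", "_:b1"),
    ("_:b1", "foaf:mbox", "mailto:joe@example.com"),
    ("ex:dig", "foaf:name", "DIG"),
}
description = describe(graph, "ex:timbl")
print(len(description))
```

  <p>Named nodes reached by an arc, such as the group here, are not
  expanded: the client is expected to look up their own
  <small>URI</small>s to learn more.</p>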

  <p>In practice, when data is stored in two documents, this means
  that any <small>RDF</small> statements which relate things in the
  two files must be repeated in each. &nbsp;So, for example, in my
  <small>FOAF</small> page I mention that I am a member of the
  <small>DIG</small> group, and that information is repeated on the
  <small>DIG</small> group data. Thus, someone starting from the
  concept of the group can also find out that I am a member.
  &nbsp;In fact, someone who starts off with my <small>URI</small>
  can find all the people who are in the same group.</p>

  <h3>Limitations on browsable data</h3>

  <p>So statements which relate things in the two documents must be
  repeated in each. This clearly is against the first rule of data
  storage: don't store the same data in two different places, as
  you will have problems keeping it consistent. This is indeed an
  issue with browsable data. A set of completely browsable data
  with links in both directions has to be completely consistent,
  and that takes coordination, especially if different authors or
  different programs are involved.</p>

  <p>We can have completely browsable data, however, where it is
  automatically generated. The <a href="http://dig.csail.mit.edu/2006/dbview/dbview.py">dbview</a>
  server, for example, provides browsable virtual documents
  containing the data from any arbitrary relational
  database.</p>

  <p>When we have data from multiple sources, then we have
  compromises. These are often settled by common sense,
  asking the question,</p>

  <blockquote>
    <p>"If someone has the URI of that thing, what relationships to
    what other objects is it useful to know about?"</p>
  </blockquote>

  <p>Sometimes, social questions &nbsp;determine the answer.
  &nbsp;I have links in my <small>FOAF</small> file that I know
  various people. &nbsp;They don't generally repeat that
  information in their <small>FOAF</small> files. Someone may say
  that they know me, which is an assertion which, in the
  <small>FOAF</small> convention, is theirs to assert, and the
  reader's to trust or not. &nbsp;</p>

  <p>Other times, the number of arcs makes it impractical. A
  <small>GPS</small> track gives thousands of times at which my
  latitude and longitude are known. Every person loading my
  <small>FOAF</small> file can expect to get my business card
  information, but not all those trackpoints. It is reasonable to
  have a pointer from the track (or even each point) to the person
  whose position is represented, but not the other way.&nbsp;</p>

  <p>One pattern is to have links of a certain property in a
  separate document. A person's homepage doesn't list all their
  publications, but instead puts a link to a separate document
  listing them. There is an understanding
  that&nbsp;<span style="font-family: monospace;">foaf:made</span>
  gives a work of some sort, but <span style="font-family: monospace;">foaf:pubs</span> points to a document
  giving a list of works. Thus, someone searching for a
  <span style="font-family: monospace;">foaf:made</span>
  link would do well to follow a <span style="font-family: monospace;">foaf:pubs</span> link. It might
  be useful to formalize the notion with a statement like</p>
  <pre>foaf:made  link:listDocumentProperty foaf:pubs.
</pre>

  <p>in one of&nbsp;the ontologies.</p>

  <h3>Query services</h3>

  <p>Sometimes the sheer volume of data makes serving it as lots of
  files possible, but cumbersome for efficient remote queries over
  the dataset. In this case, it seems reasonable to provide a
  <small>SPARQL</small> query service. To make the data effectively
  linked, someone who only has the <small>URI</small> of something
  must be able to find their way to the <small>SPARQL</small>
  endpoint.</p>

  <p>Here again the <small>HTTP</small> 303 response can be used,
  to refer the enquirer to a document with metadata about which
  query service endpoints can provide what information about which
  classes of <small>URI</small>s.</p>

  <p>Vocabularies for doing this have not yet been standardized.</p>

  <h2><a id="fivestar" name="fivestar">Is your Linked Open Data 5
  Star?</a></h2>

  <p>(Added 2010.) This year, in order to encourage
  people -- especially government data owners -- along the road to
  good linked data, I have developed this star rating system.</p>

  <p>Linked Data is defined above. Linked <em>Open</em> Data (LOD)
  is Linked Data which is released under an open licence, which
  does not impede its reuse for free. Creative Commons CC-BY is an
  example open licence, as is the UK's <a href="http://www.nationalarchives.gov.uk/doc/open-government-licence/">
  Open Government Licence</a>. Linked Data does not of course in
  general have to be open -- there is a lot of important use of
  linked data internally, and for personal and group-wide data. You
  can have 5-star Linked Data without it being open. However, if it
  claims to be Linked Open Data then it does have to be open, to
  get any star at all.</p>

  <p>Under the star scheme, you get one (big!) star if the
  information has been made public at all, even if it is a photo of
  a scan of a fax of a table -- if it has an open licence. Then you
  get more stars as you make it progressively more powerful and
  easier for people to use.</p>

  <table>
    <tbody><tr>
      <td class="stars">★</td>

      <td>Available on the web (whatever format) <i>but with an
      open licence, to be Open Data</i></td>
    </tr>

    <tr>
      <td class="stars">★★</td>

      <td>Available as machine-readable structured data (e.g. Excel
      instead of image scan of a table)</td>
    </tr>

    <tr>
      <td class="stars">★★★</td>

      <td>As (2), plus non-proprietary format (e.g. CSV instead of
      Excel)</td>
    </tr>

    <tr>
      <td class="stars">★★★★</td>

      <td>All the above, plus: Use open standards from W3C (RDF and
      SPARQL) to identify things, so that people can point at your
      stuff</td>
    </tr>

    <tr>
      <td class="stars">★★★★★</td>

      <td>All the above, plus: Link your data to other people’s
      data to provide context</td>
    </tr>
  </tbody></table>

  <p>How well does your data do? You can buy <a href="http://www.cafepress.co.uk/w3c_shop.480759174">5 star data
  mugs</a>, T-shirts and bumper stickers from the W3C shop at
  cafepress: use them to get your colleagues and fellow
  conference-goers thinking 5 star linked data. (Profits also help
  W3C :-).</p>

  <p>Now in 2010, people have been pressing me, for government data,
  to add a new requirement: that there should be metadata about
  the data itself, and that that metadata should be available from
  a major catalog. Any open dataset (or even datasets which are not
  but should be open) can be registered at ckan.net. Government
  datasets from the UK and US should be registered at data.gov.uk
  or data.gov respectively. Other countries I expect to develop
  their own registries. Yes, there should be metadata about your
  dataset. That may be the subject of a new note in this
  series.</p>

  <h2><a id="conclusion" name="conclusion">Conclusion</a></h2><br>

  <p>Linked data is essential to actually connect the semantic web.
  &nbsp;It is quite easy to do with a little thought, and becomes
  second nature. &nbsp; Various common sense considerations
  determine when to make a link and when not to.</p>

  <p>The <a href="http://dig.csail.mit.edu/2005/ajar/ajaw/tab">Tabulator</a>
  client (running in a suitable browser)&nbsp; allows you to browse
  linked data using the above conventions, and can be used to check
  that your linked data works.</p>

  <p>References</p>

  <p>[Ding2005] Li Ding, et al., <a href="http://ebiquity.umbc.edu/paper/html/id/240/"><span style="font-style: italic;">Tracking RDF Graph Provenance using RDF
  Molecules</span></a>, UMBC Tech Report TR-CS-05-06</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 20 Oct 2021 00:00:00 GMT</pubDate>
  <title>Live data</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Live.html</link>
    <guid>https://www.w3.org/DesignIssues/Live.html</guid>
      <description><![CDATA[When applications are built by sharing access-controlled
read-write linked data, it is useful for one application to be informed in
real time when another changes the data. By adding real-time publish/subscribe
(pub/sub) functionality to the architecture, the system can react in real time
without having to poll. The Solid protocol includes a basic
but effective form of this using WebSockets, where any app or part of an app
which is using data from a given resource can listen for changes to that resource.
In 2021, the live update protocol is just a web socket  'PING' notification that the
resource has changed, after which the client re-loads it. In future it would be good to
instead send a PATCH with the change that had happened, to reduce both the bandwidth
necessary and the number of network round trips between client and server. This will allow us to
connect to more complex distributed protocols such as Conflict-free Replicated Data Types
 (<a href="https://crdt.tech/">CRDT</a>s), and provide offline
and Local First functionality in future. But right now a simple WebSocket protocol
provides great user value, by allowing all kinds of apps to become live apps.
<p><a href="https://www.w3.org/DesignIssues/Live.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 01 Jan 1999 00:00:00 GMT</pubDate>
  <title>Logic and the semantic web</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Logic.html</link>
    <guid>https://www.w3.org/DesignIssues/Logic.html</guid>
      <description><![CDATA[
    <h1>
      The Semantic Web as a language of logic
    </h1>
    <p>
      This looks at the Semantic Web design in the light of a little
      reading on formal logic, of the Access Limited Logic system
      in particular, and in the light of logical languages in
      general. A problem here is that I am no logician, and so I
      am having to step like a fascinated reporter into this
      world of which I do not possess intimate experience.
    </p>
    <h2>
      Introduction
    </h2>
    <p>
      The <a href="Toolbox.html">Semantic Web Toolbox</a> discusses
      the step from the web as being a repository of flat data
      without logic to a level at which it is possible to express
      logic. This is something which knowledge representation
      systems have been wary of.
    </p>
    <p>
      The Semantic Web has a different set of goals from most
      systems of logic. As Crawford and Kuipers put it in [<a href="#Crawf90">Crawf90</a>],
    </p>
    <blockquote>
      [...]a knowledge representation system must have the
      following properties:
      <ol>
        <li>It must have a reasonably compact syntax.
        </li>
        <li>It must have a well defined semantics so that one can
        say precisely what is being represented.
        </li>
        <li>It must have sufficient expressive power to represent
        human knowledge.
        </li>
        <li>It must have an efficient, powerful, and understandable
        reasoning mechanism
        </li>
        <li>It must be usable to build large knowledge bases.
        </li>
      </ol>
      <p>
        It has proved difficult, however, to achieve the third and
        fourth properties simultaneously.
      </p>
    </blockquote>
    <p>
      The semantic web goal is to be a unifying system which will
      (like the web for human communication) be as un-restraining
      as possible so that the complexity of reality can be
      described. Therefore item 3 becomes essential. This can be
      achieved by dropping 4 - or the parts of item 4 which
      conflict with 3, notably a single, efficient reasoning
      system. The idea is that, within the global semantic web,
      there will be a subset of the system which will be
      constrained in specific ways so as to achieve the
      tractability and efficiency which are necessary in real
      applications. However, the semantic web itself will not
      define a reasoning engine. It will define valid operations,
      and will require consistency for them. On the semantic web in
      general, a party must be able to follow a proof of a theorem
      but is not expected to generate one.
    </p>
    <p>
      (This fundamental change of goals from KR systems to the
      semantic web is loosely analogous with the goal change from
      conventional hypertext systems to the original Web design,
      dropping link consistency in favor of expressive flexibility
      and scalability. The latter did not prevent individual web
      sites from having a strict hierarchical order or matrix
      structure, but it did not require it of the web as a whole.)
    </p>
    <p>
      If there is a <em>semantic web machine</em>, then it is a
      proof validator, not a theorem prover. It can't find answers,
      it can't even check that an answer is right, but it can
      follow a simple explanation that an answer is right. The
      Semantic Web as a source of data should be fodder for
      automated reasoning systems of many kinds, but it is not as
      such a reasoning system.
    </p>
    <p>
      Most knowledge representation systems distinguish between
      inference "rules" and other believed information. In some
      cases, this is because the rules (such as substitution in a
      formula) cannot be written in the language - they are defined
      outside the language. In fact the set of rules used by the
      system is often not only formally quite redundant but
      arbitrary. However, a universal design such as the Semantic
      Web must be minimalist. We will ask all logical data on the
      web to be expressed directly or indirectly in terms of the
      semantic web - a strong demand - so we cannot constrain
      applications any further. Different machines which use data
      from the web will use different algorithms, different sets of
      inference rules. In some cases these will be powerful AI
      systems and in others they will be simple document conversion
      systems. The essential thing is that the results of either
      must be provably correct against the same basic minimalist
      rules. In fact for interchange of proof, the set of rules is
      an engineering choice.
    </p>
    <p>
      There are many related ways in which subsystems can be
      created:
    </p>
    <ul>
      <li>The semantic web language can be subsetted, by the
      removal of operations and axioms and rules;
      </li>
      <li>The set of statements may be limited to that from
      particular documents or web sites;
      </li>
      <li>The form of formulas used may be constrained, for example
      using document schemata;
      </li>
      <li>Application design decisions can be made so as to
      specifically guarantee tractable results using common
      reasoning engines.
      </li>
      <li>Proofs can be constructed by completely hand-built
      application-specific programs
      </li>
    </ul>
    <p>
      For example, Access Limited Logic is restricted (as I
      understand it) to relations r(a,b) available when r is
      accessed, and uses inference rules which only chain forward
      along such links. There is also a "partitioning" of the Web
      by partitioning the rules in order to limit complexity.
    </p>
    <p>
      For the semantic web as a whole, then, we do not require
      tractability, but we do require:
    </p>
    <ul>
      <li>Consistency, that it must not be possible to deduce a
      contradiction (without having been given one)
      </li>
      <li>Strength in that all applications must be subsets
      </li>
    </ul>
    <h3>
      Grounding in Reality
    </h3>
    <p>
      Philosophically, the semantic web produces more than a set of
      rules for manipulation of formulae. It defines documents on
      the Web has having a socially significant meaning. Therefore
      it is not simply sufficient to demonstrate that one can
      constrain the semantic web so as to make it isomorphic to a
      particular algebra of a given system, but one must ensure
      that a particular mapping is defined so that the web
      representation of that particular system conveys its semantics
      in a way that it can meaningfully be combined with the rest
      of the semantic web. Electronic commerce needs a solid
      foundation in this way, and the development of the semantic
      web is (in 1999) essential to provide a rigid framework in
      which to define electronic commerce terms, before electronic
      commerce expands as a mass of vaguely defined semantics and
      ad hoc syntax which leaves no room for automatic treatment,
      and in which the court of law rather than a logical
      derivation settles arguments.
    </p>
    <p>
      Practically, the meaning of semantic web data is grounded in
      non-semantic-web applications which are interfaced to the
      semantic web. For example, currency transfer or ecommerce
      applications, which accept semantic web input, define for
      practical purposes what the terms in the currency transfer
      instrument mean.
    </p>
    <h2>
      Axiomatic Basis
    </h2>
    <p>
      <em>@@I [DanC] think this section is outdated by recent
      thoughts [2002] on</em> <a href="#reasoning"><em>paradox and
      the excluded middle</em></a>
    </p>
    <p>
      To the level of first order logic, we don't really need to
      pick one set of axioms in that there are equivalent choices
      which lead to demonstrably the same results.
    </p>
    <p>
      (A cute one at the propositional logic level seems [Burris,
      p126] to be Nicod's set in which nand (in XML toolbox
      &lt;not&gt;..&lt;/not&gt; and below [xy]) is the Sheffer
      (sole) connective and the only rules of inference are
      substitution and the <em>modus ponens</em> equivalent that
      from F and [F[G H]] one can deduce H, and the single axiom
      [[P[QR]][[S[SS]][[UQ][[PU][PU]]]]].)
    </p>
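    <p>
      As a sanity check on the parenthetical above, the following is a
      minimal sketch (not part of the original note) which reads [xy] as
      nand, takes the axiom with its brackets balanced, and tests it
      against every truth assignment. The function names are
      illustrative assumptions. Both checks print True.
    </p>

```python
from itertools import product


def nand(a, b):
    """Sheffer stroke: false only when both inputs are true."""
    return not (a and b)


def nicod_axiom(p, q, r, s, u):
    """[[P[QR]][[S[SS]][[UQ][[PU][PU]]]]] with [xy] read as nand(x, y)."""
    left = nand(p, nand(q, r))
    middle = nand(s, nand(s, s))
    right = nand(nand(u, q), nand(nand(p, u), nand(p, u)))
    return nand(left, nand(middle, right))


def is_tautology(formula, arity):
    """True if the formula holds under every truth assignment."""
    return all(formula(*values) for values in product([False, True], repeat=arity))


def modus_ponens_is_sound():
    """Whenever F and [F[G H]] are both true, H must be true."""
    return all(
        h
        for f, g, h in product([False, True], repeat=3)
        if f and nand(f, nand(g, h))
    )


print(is_tautology(nicod_axiom, 5))
print(modus_ponens_is_sound())
```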
    <p>
      Let us assume the properties of first order logic here.
    </p>
    <p>
      If we add anything else we have to be careful that it should
      either be definable in terms of the first order set or that
      the resulting language is a subset of a well proven logical
      system -- or else we have a lot of work to do in establishing
      a new system!
    </p>
    <h2>
      Intractability and Undecidability
    </h2>
    <p>
      These are two goals to which we explicitly do not aspire in
      the Semantic Web in order to get in return expressive power.
      (We still require consistency!). The world is full of
      undecidable statements, and intractable problems. The
      semantic web has to give the power to express such things.
    </p>
    <p>
      Crawford and Kuipers explain in the introduction to
      their Negation in ALL paper,
    </p>
    <blockquote>
      "Experience with formally specified knowledge representation
      systems has revealed a trade-off between the expressive power
      of knowledge representation systems and their computational
      complexity. If, for example, a knowledge representation
      system is as expressive as first order predicate calculus,
      then the problem of deciding what an agent could logically
      deduce from its knowledge base is unsolvable"
    </blockquote>
    <p>
      Do we need in practice to decide what an agent could deduce
      from its logic base? No, not in general. The agent may have
      various kinds of reasoning engine, and in practice also
      various amounts of connectivity, storage space, access to
      indexes, and processing power which will determine what it
      will actually deduce. Knowing that a certain algorithm may be
      nondeterministic polynomial in the size of the entire Web may
      not be at all helpful, as even linear time would be quite
      impractical. Practical computability may be assured by
      topological properties of the web, or the existence of known
      shortcuts such as precompiled indexes and definitive
      exclusive lists.
    </p>
    <p>
      Keeping a language less powerful than first order predicate
      calculus is quite reasonable within an application, but not
      for the Web.
    </p>
    <h2>
      Decidability
    </h2>
    <p>
      A dream of logicians in the last century was to find languages in
      which all sentences were either true or false, and provably
      so. This involved trying to restrict the language so as to
      avoid the possibility of (for example) self-contradictory
      statements which can not be categorized as true or not
      true.
    </p>
    <p>
      On the Semantic Web, this looks like a very academic problem,
      when in fact one anyway operates with a mass of untrustworthy
      data at any point, and restricts what one uses to a limited
      subset of the web. Clearly one must not be able to derive a
      self-contradictory statement, but there is no harm in the
      language being powerful enough to express it. Indeed,
      endorsement systems must give us the power to say "that
      statement is false" and so loops which if believed prove
      self-contradictory will arise by accident or design. A
      typical response of a system which finds a self-contradictory
      statement might be similar to the response to finding a
      contradiction, for example, to cease to trust information
      from the same source (or public key).
    </p>
    <h3>
      Reflection: Quoting, Context, and/or Higher Order Logic
    </h3>
    <p>
      <em>@@hmm... better section heading? maybe just quoting, or
      contexts? one place where we really do seem to need more than
      HOL is <a href="#L736">induction</a>.</em>
    </p>
    <p>
      The fact that there is [Burris p___] "no good set of axioms
      and rules for higher order logic" is frustrating not only in
      that it stumps the desire to write common sense
      mathematically, but also because operations which seem
      natural for electronic commerce seem at first sight to demand
      higher order logic. There is also a fundamental niceness to
      having a system powerful enough to describe its own rules, of
      course, just as one expects to be able to write a compiler
      for a programming language in the same language <em>(@@need
      to study</em> <a href="http://lists.w3.org/Archives/Public/www-archive/2002Apr/0057.html">
      <em>references from Hayes</em></a><em>, esp "Tarski's results
      on meta-descriptions (a consistent language can't be the same
      expressive power as its own metatheory), Montague's paradox
      (showing that even quite weak languages can't consistently
      describe their own semantics)"</em>). When Frege tried
      second-order logic, I understand, Russell showed that his
      logic was inconsistent. But can we make a language which
      is consistent (you can't derive a contradiction from its
      axioms) and yet allows enough to, for example:
    </p>
    <ul>
      <li>Model human trust in a realistic way
      </li>
      <li>Write down the mapping from XML to RDF logic to allow a
      theorem to be proved from the raw XML (and similarly define
      the XML syntax in logic to allow a theorem to be proved from
      the byte stream), and using it;
      </li>
    </ul>
    <p>
      The sort of rule it is tempting to write is such as to allow
      the inference of an RDF triple from a message from whose semantic
      content one can algebraically derive that triple.
    </p>
    <pre>forall message,t, r, x, y (
  (signed(message,K)
    &amp; derivable(t, message)
    &amp; subject(t, x)
    &amp; predicate(t, r)
    &amp; object(t, y))
   -&gt; r(x,y)
)
</pre>
    <p>
      (where K is a specific constant public key, and t is a
      triple)
    </p>
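    <p>
      To make the shape of that rule concrete, here is a small sketch,
      under loud assumptions: signature checking is taken as already
      done (the signed_by field simply records the verified key), and
      the triples derivable from a message are given as a list. The
      names Triple, Message and believed_facts are hypothetical and are
      not taken from any RDF toolkit.
    </p>

```python
from collections import namedtuple

# A triple t with subject x, predicate r and object y.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])
# A message records which key signed it and which triples are derivable from it.
Message = namedtuple("Message", ["signed_by", "derivable"])


def believed_facts(messages, trusted_key):
    """Apply the rule: signed(message, K) and derivable(t, message) gives r(x, y)."""
    facts = set()
    for message in messages:
        if message.signed_by != trusted_key:
            continue  # premise signed(message, K) fails for this message
        for t in message.derivable:
            facts.add((t.predicate, t.subject, t.object))  # the conclusion r(x, y)
    return facts


K = "key-K"  # hypothetical stand-in for the specific constant public key
messages = [
    Message("key-K", [Triple("alice", "owes", "bob")]),
    Message("key-other", [Triple("bob", "owes", "alice")]),
]
facts = believed_facts(messages, K)
print(facts)
```

    <p>
      Only the triple from the message signed with K ends up in the set
      of facts, which is exactly the crossing from language mechanics to
      subject matter that the rule performs.
    </p>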
    <p>
      This breaks the boundary between the premises which deal with
      the mechanics of the language and the conclusion which is
      about the subject-matter of the language. Do we really need
      to do this, or can we get by with several independent levels
      of machinery, letting one machine prepare a "believable"
      message stream and parse it into a graph, and then a second
      machine which shares no knowledge space with the first, do
      the reasoning on the result? To me this seems hopeless, as
      one will in practice want to direct the front end's search
      for new documents from the needs of the reasoning by the back
      end. But this is all hunch.
    </p>
    <p>
      Peregrin tries to categorize the needs for and problems with
      higher order logic (HOL) in [Peregrin]. His description of
      Henkinian Understanding of HOL in which predicates are a
      subclass of objects ("individuals") seems to describe my
      current understanding of the mapping of RDF into logic, with
      RDF predicates, binary relations, being a subclass of RDF
      nodes. Certainly in RDF the "property" type can be deduced
      from the use of any URI as a predicate:
    </p>
    <p>
      forall p,x,y p(x,y) -&gt; type(p, property)
    </p>
    <p>
      and we assume that the "assert" predicate
      &lt;rdf:property&gt; is equivalent to the predicate itself.
    </p>
    <p>
      forall p,x,y assert(p,x,y) &lt;--&gt; p(x,y)
    </p>
    <p>
      so we are moving between second-order formulation and
      first-order formulation.
    </p>
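    <p>
      A rough illustration of moving between the two formulations,
      assuming facts are held as (p, x, y) tuples: the function names
      reify and unreify are hypothetical, and the sketch only mirrors
      the two formulas above rather than any particular RDF
      implementation.
    </p>

```python
def reify(facts):
    """From each p(x, y) produce assert(p, x, y) and type(p, property)."""
    statements = set()
    for p, x, y in facts:
        statements.add(("assert", p, x, y))
        statements.add(("type", p, "property"))
    return statements


def unreify(statements):
    """From each assert(p, x, y) recover p(x, y)."""
    return {s[1:] for s in statements if s[0] == "assert"}


facts = {("owes", "alice", "bob"), ("knows", "alice", "carol")}
statements = reify(facts)
print(unreify(statements) == facts)
```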
    <p>
      (2000) The experience of the [<a href="#PCA">PCA</a>] work
      seems to demonstrate that higher order logic is a very
      realistic way of unifying these systems.
    </p>
    <p>
      (2001) The treatment of contexts in [<a href="#CLA">CLA</a>]
      seems consistent with the design we've implemented.
    </p>
    <h2>
      <a name="L736">Induction, primitive recursion, and
      generalizing to infinitely many cases</a>
    </h2>
    <p>
      It seems clear that FOL is insufficient in that some sort of
      induction seems necessary.
    </p>
    <blockquote>
      <p>
        I agree with Tait (Finitism, J. of Philosophy, 1981, 78,
        524-546) that PRA is THE NECESSARY AND SUFFICIENT logic for
        talking about logics and proofs
      </p>
      <p>
        <a href="http://theory.stanford.edu/people/uribe/mail/qed.messages/22.html">
        Robert S. Boyer, 18 Apr 93</a>
      </p>
    </blockquote>
    <p>
      also: <a href="/2001/03swell/pra.n3">pra.n3</a>, an N3
      transcription of <a href="http://www.earlham.edu/~peters/courses/logsys/recursiv.htm">Peter
      Suber, Recursive Function Theory</a>
    </p>
    <p>
      also: ACL2: <a href="http://www.cs.utexas.edu/users/moore/publications/km97a.ps.gz">
      A Precise Description of the ACL2 Logic</a> Kaufmann and
      <a href="http://www.cs.utexas.edu/users/moore/">Moore</a> 22
      Apr 1998, <a href="http://rdfig.xmlhack.com/2002/03/26/2002-03-26.html#1017177958.271019">
      rdf scratchpad entry 26Mar</a>
    </p>
    <p>
      (for another sort of induction, i.e. as opposed to deduction,
      see: <a href="http://www-formal.stanford.edu/jmc/circumscription.html">Circumscription</a>
      by McCarthy, 1980.)
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 01 Oct 2001 00:00:00 GMT</pubDate>
  <title>A quick look at iCalendar</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/LookAtiCalendar.html</link>
    <guid>https://www.w3.org/DesignIssues/LookAtiCalendar.html</guid>
      <description><![CDATA[
      

<h2>Building an RDF model:</h2>

<h1>A quick look at iCalendar</h1>

<p>I spent a few hours reading 50 pages of the iCalendar <a href="http://www.ietf.org/rfc/rfc2445.txt">RFC2445</a> with a view to
evaluating proposals to put it into XML.  My conclusion early on was that the
spec should be written in terms of <a href="/RDF">RDF</a> properties,
particularly as it has a clear property/value and parameter/value
structure.</p>


<h2><a name="Summary">Summary</a></h2>

<p>General points I noticed included</p>
<ol>
  <li>The spec is full of x-extensions and IANA registries. These would all
    be done using namespaces in XML.</li>
  <li>There is no summary of properties with their domains and ranges, which
    would make the spec much clearer.</li>
  <li>The parameter value type of "URI" implicitly causes dereferencing. This
    is not clear from the spec but is assumed by the examples.</li>
  <li>There are a few examples of wanton reification, e.g. relationship
  type.</li>
  <li>Encodings, for cleanliness: the encoding is a relationship between two
    objects, not the property of an object. Same comment on XML DSig.</li>
  <li>I am concerned that I have not found very much protocol defining
    how agents interact, or what a message containing a calendar entry means.
    But maybe that is elsewhere in the spec.</li>
</ol>

<h2><a name="Narrative">Narrative</a></h2>

<p>When looking for a natural representation in RDF of data in a given
language, one looks at first for the natural structure of the language. iCalendar
has a nested set of structures which naturally lend themselves to an RDF
graph interpretation. Apart from the noted exceptions, this translation leads
to a set of fairly logically defined RDF properties which could form
iCalendar's contribution to the semantic web.</p>

<p>A "calendar" consists of a set of components, such as events, and to-do
list and journal entries.  These seem natural RDF types.  (There is a choice
of whether to introduce a specific property as the relationship
between the containing calendar and a specific type of component, or whether to
use a generic inclusion property and then specify the subtype of the
component.)</p>

<p>The components have properties, even known as properties in iCalendar. Now
each property is in fact a complex thing which has a "value" (implicitly
named) and various "parameters" with names.</p>

<p>The named parameters are clearly easily represented  as RDF properties.</p>

<p>The values are generally atomic things such as integers and strings, with
two exceptions. One is when the value is a URI, which implies that the
actual value is in a document with that URI.  Another is that the value
datatype "recur" is a string which itself has a substructure. This recurrence
substructure takes the form of (guess what!) a set of attribute value
pairs.</p>
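
<p>The property/parameter/value structure described above can be sketched as
follows. This is a simplified illustration, not a conforming RFC2445 parser:
it assumes an already unfolded content line, ignores multi-valued parameters,
and the function names are hypothetical. Each line becomes a property name
plus a node carrying the implicit value and the named parameters.</p>

```python
def split_unquoted(text, separator):
    """Split text on separator, ignoring separators inside double quotes."""
    parts, current, quoted = [], [], False
    for ch in text:
        if ch == '"':
            quoted = not quoted
        if ch == separator and not quoted:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_content_line(line):
    """Turn 'NAME;PARAM=val;...:value' into (name, node)."""
    head_and_value = split_unquoted(line, ":")
    head = head_and_value[0]
    # The value itself may contain colons (for example a URI), so rejoin it.
    value = ":".join(head_and_value[1:])
    name, *raw_params = split_unquoted(head, ";")
    params = {}
    for raw in raw_params:
        key, _, val = raw.partition("=")
        params[key.lower()] = val.strip('"')
    return name.lower(), {"params": params, "value": value}


line = 'ATTENDEE;DELEGATED-TO="mailto:a@y.com";DELEGATED-FROM="mailto:b@y.com":c@y.com'
name, node = parse_content_line(line)
print(name, node)
```

<p>Keeping the implicit value apart from the named parameters avoids a clash
with the VALUE parameter, which is itself a parameter name in iCalendar.</p>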

<h2><a name="Detailed">Detailed comments</a></h2>

<h3>2.3 Internationalization</h3>

<p>If this were XML this would be done for you, with Unicode and the  various
encodings etc.</p>

<h3>4.1 Content Lines</h3>

<p>x-name and iana-token are extensions which XML would give for free using
namespaces.</p>

<p></p>

<p>"Each property defines the specific ABNF for the parameters allowed on the
property"</p>

<p>This makes general parsing impossible, direct conversion into XML
difficult.  The only hope is that in fact it is not true and there is more
consistency than this line leads you to believe! This sounds like a remake of
the RFC822 problem which HTTP has in spades: One parser per page of the
spec.</p>

<h3><a name="4.1.3">4.1.3</a></h3>

<p>Here in the example</p>
<pre>ATTACH;FMTTYPE=image/basic;ENCODING=BASE64;VALUE=BINARY:
      MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcNAQEEBQAwdzELMAkGA1U
      EBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENvbW11bmljYXRpb25zIE
      &lt;...remainder of "BASE64" encoded binary data...&gt;</pre>

<p>represents the encoding as though it were a property of the value. It
isn't: it is a relationship between the value and the string expressed here.
Nicer to write that.</p>

<p></p>
<pre>&lt;attach&gt;
   &lt;fmttype&gt;image/basic&lt;/fmttype&gt;
   &lt;base64&gt;MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcN
     [...]
   &lt;/base64&gt;
&lt;/attach&gt;</pre>

<p>which would mean (in XML or RDF nonstriped strawman syntax) "Something is
attached which has content type  image/basic and has base64 encoding
MMICCblablahblah".</p>

<p>Note that making base64 a first class relationship (subclass of encoding)
makes for brevity and extensibility: with a namespace I can introduce a new
one.</p>

<p>Value=binary has all these problems and is unnecessary. It is assumed in
base64. The earlier example with the URI</p>
<pre>ATTACH:http://xyz.com/public/quarterly-report.doc</pre>

<p>has an implicit dereferencing operation which it would be best to
expose:</p>

<p></p>
<pre>&lt;attach&gt;
   &lt;uri&gt;http://xyz.com/public/quarterly-report.doc
&lt;/uri&gt;
&lt;/attach&gt;</pre>

<p>which means, consistently with the previous example, "something is
attached which is identified by URI http://...."</p>

<h3>4.2 property parameters</h3>

<p>Property parameter values MUST NOT contain a double quote. So I guess that
if I want to represent something which does... I attach it?</p>

<p></p>

<p>4.2.1</p>

<p>ALTREP and many of the following parameters can be represented obviously
as RDF properties. There needs to be an explicit property between the
introduced thing and any "value".</p>
<pre>&lt;description&gt;
   &lt;altrep&gt;cid:asdfsadf@sdfsdaf.com&lt;/altrep&gt;
   &lt;text&gt;Project XYZ review meeting&lt;/text&gt;
&lt;/description&gt;</pre>

<p>This becomes more obvious when you look at things like ATTENDEE.</p>

<p></p>

<p>4.2.2.</p>

<p>There seems to be an embryonic notion of type here ("properties with the
CAL-ADDRESS value type"). I assume this can be formalized. It would be so much
simpler if this were tabulated.</p>

<p></p>

<p>4.2.3 Calendar User Type.</p>

<p>"mailto:" is usually in lower case. I thought it was in fact mandatory
that it be in lower case.</p>

<p></p>

<h3>4.2.5 Delegatees</h3>

<p>It is very confusing who ends up being the attendee notionally when both
delegated-to and -from are specified. Changing this to RDF, or contemplating
doing logical operations on this, makes one queasy about the solidity here.</p>

<p>ATTENDEE;DELEGATED-TO="mailto:a@y.com";DELEGATED-FROM="mailto:b@y.com":c@y.com</p>

<p>What is that equivalent to? I assume a@y.com goes to the meeting.</p>

<p></p>

<h3>4.2.7. See comment about 4.1.3</h3>

<h3>4.2.9 Free/Busy Time type</h3>

<p>make relationships first class</p>

<p>FREEBUSY=FREE: would be better as FREE: to reduce unnecessary complication
and allow extension.</p>

<p>If that section of the spec (4.2.9) seems to be self-referential and
difficult to read, that is also because it is describing an unnatural part of
a clumsy syntax.  You don't say "I am free or busy as follows: 12-1pm and we
are talking about free here"!  Because RDF makes these things first class
objects and allows you to group FREE and BUSY and REALLYBUSY as subclasses of
FREEBUSYTYPE, life is easier.</p>

<h3>4.2.10 language</h3>

<p>xml:lang of course is what one would get for free with XML.</p>

<h3>4.2.15</h3>

<p>"RELATED-TO:RELTYPE=SIBLING" is a classic wanton reification. Just say
SIBLING:</p>

<p>Unfortunately the specification defines how calendars can be put into a
hierarchical relationship but doesn't say what that relationship *means*.
Maybe it does later in the spec.</p>

<p></p>

<h3>4.2.18 Sent By</h3>

<p>This is a relationship between a mailbox and another mailbox.  It is that
the owner of one mailbox is being represented by the owner of another. Yes,
the message which asserted this data was probably sent by the agent, but the
term is misleading when it crops up in the data.  This will cause confusion.
This is an example of the clarification which arises when you try to
represent the meaning of each rdf:property (icalendar:parameter)
independently.</p>

<h3>4.2.20 Value Data Type</h3>

<p>Note that the "URI" data type does not just constrain the value string to
be a valid URI, but indicates that the value string is the document you get
when you dereference the URI. Big difference, particularly when you automate
the base 64 decoding of something.</p>

<p></p>

<p>In general, note that XML data types are defined by the XML Schema working group.
See draft @@. A comparison would be a useful exercise.</p>

<h3>4.8.4.1 <a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor105">ATTENDEE</a></h3>

<p>"If the LANGUAGE property parameter is specified, the identified language
applies to the CN parameter"</p>

<p>That is a terrible bit of design - a typical bit of interference between
different headers which is so tempting for designers in these flat specs which
can't use nesting.  How many other clauses like this are there?</p>

<p>LANGUAGE is, I must admit, something RDF has a problem with in general. It is
difficult to specify that a string has a language without making an
intermediate node that you don't want. This is, I realize, the same as the
intermediate packaging problem: how to let a system know that what it asked
for is inside, but in the meantime, here is some useful information about
it. Here is a number and by the way it is prime. Here is a GIF and by the way
it is copyright. Here is a common name and by the way it is in English. It is
interesting to see the way iCalendar has the same problem.</p>

<p></p>

<h3>4.8.4 <a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor111">UID</a></h3>

<p>There is linking between components of calendars which uses "UIDs" which
are mid URIs with the prefix removed. This is a bug:</p>
<ul>
  <li>It removes calendar objects from the URI space so that one cannot refer
    to them with any other system which uses a URI -- unless you simply
    assume that you can by using mid:</li>
  <li>The spec is full of recommendations for making identifiers unique.</li>
  <li>It has a given length of 255 characters which is a y2k bug asking to
    happen. Never specify fixed buffer sizes.</li>
</ul>

<p></p>

<h4>4.8.7.4 "<a href="http://xml.resource.org/public/rfc/html/rfc2445.html#anchor125">SEQUENCE</a>"</h4>

<p>This is not in fact a property of an event, but is a property of a given
expression of the state of an event. The rule is that it must be incremented
by the organizer if the event changes significantly. In a peer-peer world, it
is not obvious what to do.</p>

<p></p>

<p></p>

<h2><a name="reviewed">Not reviewed</a></h2>

<p>I skipped most of the rest of the spec but a few very similar concerns
arose with some other parts I glanced at.</p>

<p></p>

<h2><a name="Conclusion">Conclusion</a></h2>

<p>It seems that RDF nodes for the calendar, for each event etc, and for each
icalendar:property is a fairly straightforward mapping.</p>

<p>A spinoff would be a vocabulary which would include useful reusable models
of time. The timezone work could be factored out if it is definitive.</p>

<p>Where RDF mapping was not obvious this sometimes coincided with unclear
aspects of the specification.</p>

<p>There are three levels at which the RDF mapping could be made</p>
<ol>
  <li>A very direct mapping of the ical:properties and parameters onto
    rdf:properties. Always use the same "value" rdf:property for the VALUE of
    an ical:property.   This would leave some things looking illogical in
    RDF. It would be simple to define as a mapping, but the definition of
    the properties would be strange in some cases.</li>
  <li>Make a few simple adjustments to make the RDF more natural.  Places to
    look for these have been indicated with a @@ in the table. This
    will make the mapping obvious to an iCal expert reading the RDF, but at
    the same time make the RDF queries simpler and the properties more
    reusable. It would move things like RELATED RELTYPE=X  into a subclass
    relationship between X and RELATED which allows generic RDF machinery to
    process it.</li>
  <li>An extensive rework in which the logic of rules was largely exposed in
    RDFS or something stronger would of course be great.</li>
</ol>
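
<p>The second level of mapping can be illustrated with a short sketch. It is
only an assumption about how such an adjustment might be coded: the function
name is hypothetical, triples are plain tuples, and the schema link is written
as subPropertyOf, which is how RDFS expresses a "subclass" relationship
between properties.</p>

```python
def promote_enum_parameter(subject, prop, params, value, enum_param):
    """Replace 'PROP;ENUM=X:value' by a first-class property X.

    Returns the data triple plus a schema triple stating that X
    specialises PROP, so generic machinery can still treat it as PROP.
    When the enumerated parameter is absent, the direct mapping is kept
    and no schema triple is produced.
    """
    if enum_param not in params:
        return (subject, prop, value), None
    specific = params[enum_param].upper()
    return (subject, specific, value), (specific, "subPropertyOf", prop)


data, schema = promote_enum_parameter(
    "event1", "RELATED", {"RELTYPE": "SIBLING"}, "event2", "RELTYPE"
)
print(data)
print(schema)
```

<p>A query for everything RELATED to an event can then follow the schema
triple, while a query for siblings needs no knowledge of RELTYPE at all.</p>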

<h3><a name="Appendix:1">Appendix: Node types</a></h3>

<table border="1">
  <caption>Node types inferred</caption>
  <tbody>
    <tr>
      <td>party</td>
      <td>implicit node in all properties with a CAL-ADDRESS value type.
        (person or group: anything which can have a mailbox)</td>
      <td></td>
    </tr>
    <tr>
      <td>cal-address</td>
      <td>A mailbox  - normally mailto:...</td>
      <td>URI</td>
    </tr>
    <tr>
      <td>CU</td>
      <td>Calendar user defined in <a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor29">CUTYPE</a></td>
      <td></td>
    </tr>
    <tr>
      <td>INDIVIDUAL, GROUP, RESOURCE, ROOM</td>
      <td></td>
      <td>CU</td>
    </tr>
    <tr>
      <td>ldap-directory</td>
      <td>starts "ldap:" (is this a standard?)</td>
      <td>URI</td>
    </tr>
    <tr>
      <td>mime-type</td>
      <td></td>
      <td>string</td>
    </tr>
    <tr>
      <td>participation status</td>
      <td>needs-action, accepted, declined, tentative, delegated, ...  (an
        enum type- could do better. <a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor38">Constraints
        in the spec</a>.)</td>
      <td>string</td>
    </tr>
    <tr>
      <td>component</td>
      <td>of a calendar</td>
      <td></td>
    </tr>
    <tr>
      <td>EVENT, TODO, etc</td>
      <td></td>
      <td>component</td>
    </tr>
    <tr>
      <td>TimeProperty</td>
      <td>DTSTART, DTEND, DUE, EXDATE, RDATE</td>
      <td></td>
    </tr>
    <tr>
      <td>Timezone</td>
      <td>see <a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor45">TZID</a></td>
      <td>string</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor62">icalobject</a></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>recur</td>
      <td>defined by recurrence properties</td>
      <td>-Really complex datatype could be broken down into RDF! Contains
        its own nested attr/value structure.</td>
    </tr>
  </tbody>
</table>

<h2>Appendix: rdf:Properties - from "parameters"</h2>

<table border="1">
  <caption>Properties from section 4</caption>
  <tbody>
    <tr>
      <th>iCalendar name</th>
      <th>domain</th>
      <th>range</th>
      <th>Notes</th>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor27">ALTREP</a></td>
      <td>anything iCal property?</td>
      <td>URI</td>
      <td>alternative to body</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor28">CN</a></td>
      <td>party</td>
      <td>string</td>
      <td></td>
    </tr>
    <tr>
      <td>: (mailbox)</td>
      <td>party</td>
      <td>cal-address</td>
      <td>Implicit node between a party and that party's mailbox. Represented by
        "value" of property</td>
    </tr>
    <tr>
      <td>CUTYPE - type</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor30">DELEGATED-FROM</a></td>
      <td>party</td>
      <td>cal-address</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor31">DELEGATED-TO</a></td>
      <td>party</td>
      <td>cal-address</td>
      <td></td>
    </tr>
    <tr>
      <td>DIR</td>
      <td>party</td>
      <td>URI</td>
      <td></td>
    </tr>
    <tr>
      <td>eightbit, base64</td>
      <td>bits</td>
      <td>text</td>
      <td>text encodes bits according to RFC2045. Was value of encoding
        "property" which was faulty model. Now, subclass of generic
        "encoding" property</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor33">ENCODING</a></td>
      <td>bits</td>
      <td>text</td>
      <td>Only in schema, as superclass of eightbit and base64. See <a href="#4.1.3">notes</a></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor34">FMTTYPE</a></td>
      <td>document</td>
      <td>mime-type</td>
      <td>Why not call it content-type?! Applies to a document. Expect the
        implicit uri property to tell you which object.</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor35">FBTYPE</a></td>
      <td></td>
      <td></td>
      <td>Supertype of the following</td>
    </tr>
    <tr>
      <td>FREE, BUSY, BUSY-UNAVAILABLE, BUSY-TENTATIVE</td>
      <td>?</td>
      <td>time-interval</td>
      <td>enum became subclasses of FBTYPE property</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor36">LANGUAGE</a></td>
      <td>string-or-doc</td>
      <td>iso-language</td>
      <td>Equivalent to xml:lang</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor37">MEMBER</a></td>
      <td>party</td>
      <td>cal-address</td>
      <td>group membership</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor38">PARTSTAT</a></td>
      <td>party</td>
      <td>enum</td>
      <td>A status: part of some protocol?</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor39">RANGE</a></td>
      <td>component</td>
      <td></td>
      <td>superclass only of ...</td>
    </tr>
    <tr>
      <td>THISANDPRIOR, THISANDFUTURE</td>
      <td>component</td>
      <td>date-time</td>
      <td>subclass of RANGE (was qualifier)</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor40">RELATED</a></td>
      <td>component</td>
      <td>period@@</td>
      <td>superclass of TRIGGER-FROM-START and TRIGGER-FROM-END?</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor41">RELTYPE</a></td>
      <td>component</td>
      <td>component</td>
      <td>Superclass only, of</td>
    </tr>
    <tr>
      <td>PARENT, CHILD, SIBLING</td>
      <td>component</td>
      <td>component</td>
      <td>Subclasses of RELTYPE. Hierarchical constraints. Semantics
      unclear@@.</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor42">ROLE</a></td>
      <td>party</td>
      <td>enum roleparam</td>
      <td>Attendee; role=chair could it be better "chair?".  Wait and see
        whether it is a separate dimension.</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor43">RSVP</a></td>
      <td>party</td>
      <td>boolean</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor44">SENT-BY</a></td>
      <td>party</td>
      <td>cal-address</td>
      <td>Misleading. "Represented by" would be better. Some message was
      sent.</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor99">TZID</a></td>
      <td>anything taking time or D</td>
      <td>timezone</td>
      <td>Yuk. Should be part of the time string. Makes time complicated</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor46">VALUE</a></td>
      <td>string-or-doc</td>
      <td>string</td>
      <td>Superclass of the following</td>
    </tr>
    <tr>
      <td>BINARY, BOOLEAN, CAL-ADDRESS, DATE, DATE-TIME, DURATION, FLOAT,
        INTEGER, PERIOD, RECUR, TEXT, TIME, URI, UTC-OFFSET</td>
      <td>string</td>
      <td>string</td>
      <td>Specifies the datatype of an associated string</td>
    </tr>
    <tr>
      <td>URI</td>
      <td>document</td>
      <td>URI</td>
      <td>Subclass of VALUE but indicates the value is the <em>content</em> of
        the resource identified.</td>
    </tr>
    <tr>
      <td>calprop</td>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor64">icalobject</a></td>
      <td></td>
      <td>superclass for the following</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor75">VERSION</a></td>
      <td>icalobject</td>
      <td>string</td>
      <td>subclass of calprop. unique.</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor74">PRODID</a></td>
      <td>icalobject</td>
      <td>string</td>
      <td>subclass of calprop

        <p>semantics? unique.</p>
      </td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor72">CALSCALE</a></td>
      <td>icalobject</td>
      <td>string</td>
      <td>subclass of calprop</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor73">METHOD</a></td>
      <td>icalobject</td>
      <td>string</td>
      <td>This is a hook for a protocol definition</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor65">VEVENT</a></td>
      <td>icalobject</td>
      <td>event</td>
      <td>Property VEVENT of calendar implies component is of type event.
        See spec for properties including this in their domain</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor66">VTODO</a></td>
      <td>icalobject</td>
      <td>todo</td>
      <td>similar</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor67">VJOURNAL</a></td>
      <td>icalobject</td>
      <td>journal</td>
      <td>similar</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor68">VFREEBUSY</a></td>
      <td>icalobject</td>
      <td>freebusy</td>
      <td>similar</td>
    </tr>
    <tr>
      <td>VTIMEZONE</td>
      <td>icalobject</td>
      <td>timezonedef</td>
      <td>similar. Definition of a timezone.</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor70">VALARM</a></td>
      <td>?component</td>
      <td>alarm</td>
      <td>can nest in component</td>
    </tr>
    <tr>
      <td>CALSCALE</td>
      <td>icalobject</td>
      <td></td>
      <td></td>
    </tr>
  </tbody>
</table>

<p></p>

<h2><a name="Appendix:">Appendix: Calendar component  Properties</a></h2>

<p>See <a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor77">spec
4.8</a></p>

<p>The columns E, T etc indicate whether the subject of the property is
permitted to be an event, todo, journal, freebusy, alarm or timezone
component.</p>

<table border="1">
  <caption>Properties of calendar components</caption>
  <tbody>
    <tr>
      <td>iCalendar name</td>
      <td>E</td>
      <td>T</td>
      <td>J</td>
      <td>F</td>
      <td><p>A</p>
      </td>
      <td>Tz</td>
      <td>range</td>
      <td>Notes</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor78">ATTACH</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td>y</td>
      <td></td>
      <td>text-or-doc</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor79">CATEGORIES</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td>text</td>
      <td>List of enums</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor80">CLASS</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>classification</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor81">COMMENT</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td>text</td>
      <td>no comment</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor82">DESCRIPTION</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td>y</td>
      <td></td>
      <td>text</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor83">GEO</a></td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>float float</td>
      <td>lat long. @@ Split into two properties?</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor84">LOCATION</a></td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>text</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor85">PERCENT-COMPLETE</a></td>
      <td></td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>integer</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor86">PRIORITY</a></td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>integer</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor87">RESOURCES</a></td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>text</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor88">STATUS</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td>text</td>
      <td>enum - see the spec.</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor89">SUMMARY</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td>y</td>
      <td></td>
      <td>text</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor91">COMPLETED</a></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>date-time</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor92">DTEND</a></td>
      <td>y</td>
      <td></td>
      <td></td>
      <td>y</td>
      <td></td>
      <td></td>
      <td>date-time or date</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor93">DUE</a></td>
      <td></td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>date-time or date</td>
      <td></td>
    </tr>
    <tr>
      <td>DTSTART</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td>y</td>
      <td></td>
      <td>y</td>
      <td>date-time or date</td>
      <td></td>
    </tr>
    <tr>
      <td>DURATION</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td>duration</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor96">FREEBUSY</a></td>
      <td></td>
      <td></td>
      <td></td>
      <td>y</td>
      <td></td>
      <td></td>
      <td>period</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor97">TRANSP</a></td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>text</td>
      <td>really boolean!</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor99">TZID</a></td>
      <td>a</td>
      <td>a</td>
      <td>a</td>
      <td>a</td>
      <td>a</td>
      <td>a</td>
      <td>text</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor100">TZNAME</a></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>y</td>
      <td>text</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor101">TZOFFFROM</a></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>y</td>
      <td>utc-offset</td>
      <td>like -0500</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor102">TZOFFTO</a></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>y</td>
      <td>utc-offset</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor103">TZURL</a></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>y</td>
      <td>URI</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor105">ATTENDEE</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>party</td>
      <td>@@ If language is specified, it applies to CN: Kludge! @@@</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor106">CONTACT</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td>text</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor107">ORGANIZER</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td>party</td>
      <td>Note in FREEBUSY the use is different</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor108">RECURRENCE-ID</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td>date-time or date</td>
      <td><strong>Could be a problem</strong>. Not a property of an event,
        but its presence makes it a reference to a specific occurrence of a
        repeated event.</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor109">RELATED-TO</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td>text (really URI which is UID of component)</td>
      <td>Subclass only of PARENT, CHILD, SIBLING above.</td>
    </tr>
    <tr>
      <td>PARENT , CHILD, SIBLING</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td>see RELATED-TO</td>
    </tr>
    <tr>
      <td>URI</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td>URI</td>
      <td>document "associated with" component.  For more information.</td>
    </tr>
    <tr>
      <td>UID</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td>UID - URI without  mid:</td>
      <td>@@ Missing scheme!!! @@ replace with mid: URI</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor113">EXDATE</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td>date-time or date</td>
      <td>Excludes the dates given @@ implicit logic makes search logic
        difficult.</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor114">EXRULE</a></td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td>recur</td>
      <td></td>
    </tr>
    <tr>
      <td>RDATE</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td>date-time or date</td>
      <td></td>
    </tr>
    <tr>
      <td>RRULE</td>
      <td>y</td>
      <td>y</td>
      <td>y</td>
      <td></td>
      <td></td>
      <td></td>
      <td>recur</td>
      <td></td>
    </tr>
  </tbody>
</table>

<p></p>

<table border="1">
  <caption>Properties of Alarm components, and config control and misc</caption>
  <tbody>
    <tr>
      <th>name</th>
      <th>domain</th>
      <th>range</th>
      <th>Notes</th>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor118">ACTION</a></td>
      <td>A</td>
      <td>text</td>
      <td>really an enum</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor119">REPEAT</a></td>
      <td>A</td>
      <td>integer</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor120">TRIGGER</a></td>
      <td>A</td>
      <td>duration or date-time</td>
      <td>See RELATED. @@ Split into two properties?</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor122">CREATED</a></td>
      <td>ETJ</td>
      <td>date-time</td>
      <td></td>
    </tr>
    <tr>
      <td>DTSTAMP</td>
      <td>ETJF</td>
      <td>date-time</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor124">LAST-MODIFIED</a></td>
      <td>ETJTz</td>
      <td>date-time</td>
      <td></td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor125">SEQUENCE</a></td>
      <td>ETJ</td>
      <td>integer</td>
      <td>fuzzy rules for incrementing this</td>
    </tr>
    <tr>
      <td><a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor128">REQUEST-STATUS</a></td>
      <td>ETJF</td>
      <td>text</td>
      <td>eg 3.1.1</td>
    </tr>
  </tbody>
</table>

<p></p>

<table border="1">
  <caption>Properties from <a href="http://memory.palace.org/public/rfcs/html/rfc2445.html#anchor57">recurrence
  rules</a></caption>
  <tbody>
    <tr>
      <th>name</th>
      <th>domain</th>
      <th>range</th>
      <th>Notes</th>
      <td></td>
    </tr>
    <tr>
      <td>UNTIL</td>
      <td>rrule</td>
      <td>text</td>
      <td rowspan="13">text - all these are text with various constraints and
        substructure</td>
      <td></td>
    </tr>
    <tr>
      <td>COUNT</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>INTERVAL</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>BYSECOND</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>BYMINUTE</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>BYHOUR</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>BYDAY</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>BYMONTHDAY</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>BYYEARDAY</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>BYWEEKNO</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>BYMONTH</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>BYSETPOS</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>WKST</td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <td>FREQ</td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
  </tbody>
</table>

<p></p>


<h2><a name="Examples">Examples</a></h2>
<pre>@@@</pre>

<h2 id="References">References</h2>

<p>There must be a much better list of resources for hacking calendar files
of various formats - but until I find it here are some random things I
found.</p>
<ul>
  <li>The iCalendar RFC:  <a href="http://www.ietf.org/rfc/rfc2445.txt">RFC2445</a></li>
  <li>Jetspeed: <a href="http://java.apache.org/jetspeed/api/org/apache/jetspeed/calendar/properties/package-summary.html">Java
    classes in Apache's Jetspeed</a> which represent the iCalendar
    properties.</li>
  <li>Open source <a href="http://www.openhandheld.org/software.html#desktop">handheld
    synchronisation software </a>at openhandheld.org</li>
  <li><a href="http://www.palmos.com/dev/tech/docs/">PalmOs
    documentation</a>; file formats (<a href="/2000/10/Palm/fileformats.pdf">pdf copy</a>)</li>
  <li><a href="/People/Connolly/drafts/web-research#when">Dan Connolly's
    design research notebook on this</a></li>
</ul>

<p></p>
      
      
    
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 01 Jan 2000 00:00:00 GMT</pubDate>
  <title>Mandatory extensions</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Mandatory.html</link>
    <guid>https://www.w3.org/DesignIssues/Mandatory.html</guid>
      <description><![CDATA[
    <h1>
      Mandatory extensions
    </h1>
    <p>
      There is a common requirement for the design of a language on
      the web that it should allow for extensions, but it must
      allow a clear declaration as to whether understanding of an
      extension is a requirement to understanding of the document
      or whether it may be ignored. (See <a href="Evolution.html">Evolvability</a>)
    </p>
    <p>
      Historically the lack of such a "mandatory field" has led to
      a complete inability to get any particular guaranteed
      behaviour from clients on the web.
    </p>
    <p>
      This is essential for partial understanding and the smooth
      evolution of the web.
    </p>
    <p>
      A simple requirement on a language is that it not only
      provide for its own extension, but provides for a way to
      explain whether a given extension is optional or not. This is
      a fundamental key to smooth evolution from the language to a
      new version.
    </p>
    <p>
      There are many ways in which it can be done. It can be done
      term by term, or in bulk about a whole new language. It can
      be specified in the new document, or in the schema for the new
      language.
    </p>
    <p>
      XML provides in Namespaces a standard way of extending
      languages. It should also, in my opinion, provide a standard
      way to specify mandatory or optional extensions.
    </p>
    <p>
      I propose two things:
    </p>
    <h3>
      Sublanguages
    </h3>
    <p>
      The simple assertion that language A is a sublanguage of
      language B means that the writer's intent is preserved if a
      document in language A is converted into a document in
      language B just by relabelling every term as being from
      language B. For XML, this means that a receiver of namespace
      A can simply process it as though the namespace had been
      declared as B.
    </p>
    <p>
      This assertion has got to be simple enough to put into a
      document for cases where the functionality is needed without
      the receiver having to dereference a schema.
    </p>
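    <p>
      As a minimal sketch of this relabelling, assuming Python's
      standard xml.etree.ElementTree module and two made-up namespace
      names (NS_A, NS_B and the relabel helper are illustrative only,
      not part of any specification), a receiver could treat a
      document in language A as one in language B like this:
    </p>
    <pre>
import xml.etree.ElementTree as ET

# Hypothetical namespace names, used only for illustration.
NS_A = "http://example.org/ns/a"
NS_B = "http://example.org/ns/b"


def relabel(element, old_ns, new_ns):
    """Return a copy of the tree with every old_ns element name moved to new_ns.

    Only element names are relabelled; attributes are copied unchanged.
    """
    prefix = "{" + old_ns + "}"
    tag = element.tag
    if isinstance(tag, str) and tag.startswith(prefix):
        tag = "{" + new_ns + "}" + tag[len(prefix):]
    copy = ET.Element(tag, dict(element.attrib))
    copy.text = element.text
    copy.tail = element.tail
    for child in element:
        copy.append(relabel(child, old_ns, new_ns))
    return copy


# Build a small document in language A without writing markup by hand.
doc_a = ET.Element("{" + NS_A + "}note")
ET.SubElement(doc_a, "{" + NS_A + "}title").text = "Hello"

# The same document, processed as though it had been declared in B.
doc_b = relabel(doc_a, NS_A, NS_B)
</pre>
    <p>
      The sketch copies the tree rather than changing it in place, so
      the original document in language A stays available unchanged.
    </p>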
    <h3>
      Optional/Ignorable/Mandatory flags for elements
    </h3>
    <p>
      In XML there are three simple things you can do with an
      element you don't understand.
    </p>
    <ol>
      <li>Stop, and conclude you do not understand the document, or
      the clause in the document. Example: logical NOT.
      </li>
      <li>Ignore the element and all its contents (including child
      elements). Example: &lt;Comment&gt;.
      </li>
      <li>Replace the element with its contents (including
      children). Example: &lt;bold&gt;.
      </li>
    </ol>
    <p>
      The schema language needs to be able to specify these very
      simply, and indeed it would be neat to be able to do it in a
      document for a given element, or in one fell swoop for all the
      elements in a given namespace.
    </p>
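    <p>
      The three behaviours listed above can be sketched as follows.
      This is only an illustration, assuming Python's standard
      xml.etree.ElementTree module. The KNOWN and POLICY tables, the
      NotUnderstood exception, and the choice to stop on an element
      with no declared policy are all assumptions of the sketch; they
      stand in for whatever a schema or an in-document flag would
      declare.
    </p>
    <pre>
import xml.etree.ElementTree as ET

STOP, IGNORE, UNWRAP = "stop", "ignore", "unwrap"

# Hypothetical tables: element types the receiver understands, and the
# declared handling for element types it does not understand.
KNOWN = {"doc", "p"}
POLICY = {"not": STOP, "comment": IGNORE, "bold": UNWRAP}


class NotUnderstood(Exception):
    """Raised when a mandatory extension element is not understood."""


def text_of(element):
    """Return the text a receiver may use, applying POLICY to unknown elements."""
    tag = element.tag
    if tag not in KNOWN:
        # Assumption of this sketch: no declared policy means mandatory.
        action = POLICY.get(tag, STOP)
        if action == STOP:
            raise NotUnderstood(tag)
        if action == IGNORE:
            return ""
        # UNWRAP: fall through, so the contents replace the element.
    parts = [element.text or ""]
    for child in element:
        parts.append(text_of(child))
        # Tail text follows the child but belongs to this element.
        parts.append(child.tail or "")
    return "".join(parts)


# Build a small document without writing markup by hand.
doc = ET.Element("doc")
para = ET.SubElement(doc, "p")
para.text = "Water is "
bold = ET.SubElement(para, "bold")
bold.text = "wet"
bold.tail = "."
ET.SubElement(para, "comment").text = "check this"

print(text_of(doc))  # prints: Water is wet.
</pre>
    <p>
      Adding an element named "not" anywhere in this document would
      make text_of raise NotUnderstood instead of returning a partial
      reading.
    </p>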
    <p>
      Languages which do not use XML should attend to these needs in
      their own way!
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 01 Jan 1999 00:00:00 GMT</pubDate>
  <title>The meaning of a document - grounding in a global namespace</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Meaning.html</link>
    <guid>https://www.w3.org/DesignIssues/Meaning.html</guid>
      <description><![CDATA[
    <h1>
      Meaning
    </h1>
    <p>
      <em>Grounding the meaning of a document in URI space.</em>
    </p>
    <p>
      What is the meaning of a document?
    </p>
    <p>
      The meaning of a document on the Web can be defined more
      precisely than an arbitrary paper document. Because we have
      the benefit of a global namespace (URIs), things become
      possible which were not before. One example is global
      hypertext; another is the rigid (though rarely absolute)
      specification of meaning. Just as a hypertext document can
      now exactly point to another document when it makes a
      reference (instead of making some vague natural language
      reference to it), so can a formal document make a precise
      reference to the language it uses.
    </p>
    <p>
      A writer of a document uses the language to convey his intent
      to the reader. It is essential that the intent of the writer
      can be well defined for both parties and in general for a
      third party.
    </p>
    <p>
      By "<dfn>language</dfn>" here I mean the set of symbols,
      the syntactic rules which constrain their combination, and
      some semantics which are conveyed by defining their
      interpretation in one or more other formal languages, or in
      some natural language.
    </p>
    <table border="5">
      <tbody>
        <tr>
          <td>
            The meaning of a document is then the product of the
            text of the document (in some language) and the meaning
            of the language.
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      On the Web, <a href="Axioms.html#Universality2">important
      things are identified by URIs</a>. This should clearly apply
      both to the document itself and to the language. The party
      which defines what a URI refers to I call the publisher, or
      owner of the URI. HTTP allows a delegated system of authority
      for ownership (DNS) to define ownership of URIs, and it also
      provides a network protocol to retrieve documents
      representing that identified by the URI. The text of a document
      is defined by its publisher and the meaning of the language
      is defined by the publisher of the language.
    </p>
    <p>
      Natural languages are constantly evolving and rather vague,
      in that no one (except <em>Scrabble</em> players) uses a
      particular dictionary as a definitive set of meanings. In
      practice, the meaning of a word in a natural language is the
      sum of the associations of that word -- logical or poetic --
      in the mind of the reader or writer. Of course society works
      on the basis of a very strong similarity of the webs of
      association in different people's minds.
    </p>
    <p>
      In the semantic web, however, meaning is not vague: the idea
      is that languages must be defined formally and as precisely
      as possible. The semantic web consists of some "terminal"
      languages which are defined solely in natural language terms,
      and some languages for which there are machine-readable
      interpretations into other formal languages. Whereas programs
      processing documents in the first sort of language will
      typically have to be hand coded, documents in the second set
      may be processed automatically to convert them into languages
      in the first set.
    </p>
    <p>
      URIs can be of various sorts, with various properties
      depending on their scheme (and, for http URIs, the
      publisher), but some URIs can be dereferenced to a definitive
      document. The document resulting from dereferencing the URI
      for a language is a place where the publisher of the language
      can put definitive information about the meaning of a
      language.
    </p>
    <h3>
      <a name="Language1" id="Language1">Language and document
      subsets</a>
    </h3>
    <p>
      As languages evolve, there can be many languages which are
      similar. "Similarity" doesn't mean much, but something which
      is well defined is when a document in one language A can be
      treated precisely as though it had been in another language
      B.
    </p>
    <h3>
      <a name="Meaning1" id="Meaning1">Meaning in XML</a>
    </h3>
    <p>
      In XML, a language is a "namespace", and the document about
      the language is called a "schema". In XML, one document can
      contain a mixture of languages, and so the schema if written
      in XML may contain information about syntactic constraints
      (in XML-schema language) and/or RDF properties (in rdf-schema
      language), or any combination of the above. (<a href="#Language">note</a>)
    </p>
    <p>
      XML puts no constraints on a language apart from syntactic
      structure. There is not (without RDF and logic or some other
      higher level) any overall framework into which new languages
      can be introduced. So, the question of <strong>what an XML
      document means depends</strong> first upon the fully
      qualified name of the <strong>document element</strong>. No
      semantics can be attached to any of its descendants in the
      document tree except in as much as is defined by the
      specification of that element type in that namespace. One
      cannot talk about the "meaning" of a subtree of a document
      without understanding the semantics of the language. In fact,
      because languages only necessarily define meaning for
      documents, the only way one can talk about the meaning of a
      subset of a document is to define how those parts of the
      document can be reassembled into a second whole document.
      This is what must be done when a digital signature is applied
      to a document.
    </p>
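    <p>
      A sketch of what it might look like for a receiver to start from
      the fully qualified name of the document element, assuming
      Python's standard xml.etree.ElementTree module. The invoice
      namespace, the read_invoice function and the INTERPRETERS table
      are hypothetical names invented for the illustration.
    </p>
    <pre>
import xml.etree.ElementTree as ET

# Hypothetical namespace, used only for illustration.
INVOICE_NS = "http://example.org/ns/invoice"


def qualified_name(element):
    """Split ElementTree's '{namespace}local' tag into (namespace, local name)."""
    tag = element.tag
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


def read_invoice(root):
    """Stand-in for code that knows the semantics of the invoice language."""
    return "invoice with " + str(len(root)) + " line(s)"


# Semantics are looked up by the qualified name of the document element.
INTERPRETERS = {(INVOICE_NS, "invoice"): read_invoice}


def interpret(root):
    handler = INTERPRETERS.get(qualified_name(root))
    if handler is None:
        raise ValueError("no known semantics for document element " + root.tag)
    return handler(root)


root = ET.Element("{" + INVOICE_NS + "}invoice")
ET.SubElement(root, "{" + INVOICE_NS + "}line")
print(interpret(root))  # prints: invoice with 1 line(s)
</pre>
    <p>
      Nothing is looked up for the child elements on their own: in
      this sketch they get whatever meaning the handler for the
      document element gives them.
    </p>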
    <h3>
      <a name="Meaning" id="Meaning">The Meaning of Digital
      Signature</a>
    </h3>
    <p>
      The language defines semantics. On the simple philosophy that
      one place is enough, it is not the place of a digital
      signature to define semantics. A digital signature on a
      document may give a party reason to use the information
      therein for purposes it would not have otherwise. The issuer
      of a public key may also put constraints on what sort of
      guarantees are made by signature with a given key. But the
      signature itself must not affect the semantics - the meaning
      - of a document. To allow it to would be to create an
      inconsistency between the intent of the writer of the
      original document and the meaning of the signed document. So,
      signatures themselves have no meaning. The meaning has to be
      ascribed to them by other documents. For example, I may say,
      "If an organization is a member of W3C according to a
      document signed with this key, then that organization is
      indeed a member". That is a trust statement which gives the
      key a connection into the world of meaning of documents.
    </p>
    <h3>
      <a name="Style" id="Style">Style as meaning</a>
    </h3>
    <p>
      (Although few people would think of presentation style of a
      document as its "meaning", and many of us spend a lot of time
      emphasising the difference between style and content and
      semantics, in fact much of what applies to style applies to
      semantics. Therefore the "meaning in terms of presentation"
      is a good test case for the architecture of the system. (For
      many documentation systems, the only semantics required is
      "H2 means a big bold block on the left"!) Style sheets
      provide an "interpretation" of a document by mapping it onto
      another well-defined language of formatting properties. The
      style sheet language gives a good definition (in English) of
      what is needed. This is an interesting comparison, and I
      mention it as a place where architectural consistency should
      be maintained, but it isn't what I normally mean by
      "meaning".)
    </p>
    <h3>
      <a name="Logical" id="Logical">Logical meaning</a>
    </h3>
    <p>
      When XML is used to encode logic, then a document is a
      formula (see <a href="Logic.html">Logic on the
      web</a>). Then, the way new predicates and constants interact
      is defined by the logic. The way fundamental new parts of the
      language (such as quantification) are added is part of a more
      general question of how arbitrary languages interact.
      Examples we have seen are the mixing of XHTML and XSL. What
      is the result - XHTML or XSL? A document or a style sheet?
      Both?
    </p>
    <h3>
      <a name="Mixing" id="Mixing">Mixing Languages</a>
    </h3>
    <p>
      XML puts no constraints on a language apart from syntactic
      structure. There is not (without for example RDF and logic)
      some overall framework into which new languages can be
      introduced. This means that every language has to define how
      it can be extended by mixing with other languages. Typically
      it will indicate the element types which can be subclassed by
      extensions and therefore incorporated into documents wherever
      that element type is allowed.
    </p>
    <p>
      One particular example of such a type is common to almost all
      languages. This is the sentence, the fully qualified
      assertion or statement, the formula with no free variables.
      Almost all whole documents count as such, though an
      interesting counterexample is a style sheet which represents a
      function: it specifies the result document as a function of an
      input document, and so itself cannot be said to be a
      stand-alone statement. (If I sent you a message consisting
      only of a stylesheet with no cover letter, what would it
      signify? What would it mean if I digitally signed it?)
    </p>
    <p>
      With that exception, it clearly makes sense to allow any
      language which has the concept of a sentence -- maybe any
      language at all -- to allow sentences from other languages to
      be included anywhere where a sentence of its own could go.
      <strong>This should be a generic feature of XML
      schemas</strong>.
    </p>
    <p>
      (It would be against the minimalist principle for XML
      generically to define other common subclasses. Note that the
      RDF spec does define properties and node types and the
      concept of subclassing in RDF. HTML defines things like block
      and inline elements, which can be subclassed in extensions;
      SVG and SMIL probably define similar concepts. The
      significance of this when looking at downloaded support code
      would be that, for example, in a set of Java classes
      implementing HTML, any subclass of "Inline element"
      would export the same software API to allow it to be
      justified and line wrapped in a text flow object. So there is
      a natural correspondence between element type subclassing and
      support class subclassing, but the two must remain distinct.
      Language specifications must always define what a language
      means without referring to implementations if they can
      possibly avoid it.)
    </p>
    <p>
      Note that without the assurances given by such information
      you cannot just go around embedding one language in another.
      Every language has to address the issue which the concept of
      RDF transparency potentially solves for RDF. A surrounding
      XML context must have the ability to quote, deny, negate or
      whatever any element. In fact, nothing in XML says that the
      meaning of a fragment is not affected by things anywhere else
      in a document. Nothing suggests that the process of removing
      sub-trees creates a valid document. (How does XML fragment
      deal with this?)
    </p>
    <h3>
      <a name="Grounded" id="Grounded">Grounded documents</a>
    </h3>
    <p>
      We can say a document is "grounded" if its meaning is
      completely defined because every term used is, directly or
      indirectly, an explicit reference to its definition in a
      document on the Web. Clearly
      a definition of "grounding" depends on the set of documents
      one considers acceptable definitions. "Grounded in W3C
      Recommendations" would imply that the closure under [i.e. set
      of all the things you can possibly end up with by repeated
      applications of] the operation of looking up definitions
      would be a subset of the set of W3C recommendations.
    </p>
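    <p>
      The closure described above can be sketched in a few lines of
      Python. This is only an illustration under stated assumptions:
      the <code>definitions</code> table, the document names and the
      two function names are invented for the example, and a real
      system would discover the references by dereferencing URIs.
    </p>

```python
# Hypothetical table: each document maps to the documents in which
# the terms it uses are defined.
definitions = {
    "invoice.xml": ["invoice-schema", "xml-spec"],
    "invoice-schema": ["xml-spec"],
    "memo.xml": ["private-vocabulary"],
}


def definition_closure(start, definitions):
    """Everything reachable from 'start' by repeatedly looking up definitions."""
    seen = set()
    pending = [start]
    while pending:
        doc = pending.pop()
        for ref in definitions.get(doc, ()):
            if ref not in seen:  # also guards against cyclic definitions
                seen.add(ref)
                pending.append(ref)
    return seen


def is_grounded(start, definitions, base_set):
    """Grounded in base_set: the closure is a subset of the acceptable documents."""
    return definition_closure(start, definitions).issubset(base_set)


base_set = {"invoice-schema", "xml-spec"}
print(is_grounded("invoice.xml", definitions, base_set))  # True
print(is_grounded("memo.xml", definitions, base_set))     # False
```

    <p>
      Changing <code>base_set</code> changes the answer, which is the
      sense in which a definition of grounding depends on the set of
      documents one considers acceptable.
    </p>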
    <p>
      This is the basis for the entire web and internet
      architecture stack today (see also: <a href="Stack.html">Stack</a>). All commercial use of the web is
      largely to be considered in this light: the meaning of
      each message sent across the Internet is well-defined by a
      series of specifications.
    </p>
    <p>
      (A sense of grounding also can be applied separately to
      different sorts of "understanding". When "understanding"
      means presentation to a human for human understanding, a
      presentation-grounded document points to all information
      such as schemata and style sheets which will enable it to be
      presented.)
    </p>
    <h3>
      Grounding as a myth: the Web of Meaning
    </h3>
    <p>
      The concept of grounded documents is important for
      predictable systems, but it is a bad model for the web -- or
      for life -- in the long run. Words in a <em>natural</em>
      language such as English are not grounded in a unique base
      set<a href="#Grounding">*</a>. Every time you look one up in
      the dictionary all you find are more words. The world is
      web-like, and any attempt by the Web to constrain it to be
      tree-like is bound to force a misrepresentation of reality.
      This is the Wittgenstein view of meaning. Understanding this
      view sometimes confuses people about the very systematic way
      in which meaning in Internet protocols is defined by layers
      and layers of specs.
    </p>
    <p>
      In fact, the two views both apply, one nested inside the
      other. Yes, meaning is use - but in the Internet protocols,
      society has set up social constraints - laws and other
      expectations - which constrain use to be according to the
      specs. This is a social constraint which your computer is
      under when you use the Internet, just as when you fill out a
      tax form you don't have a choice as to how to interpret the
      meaning of "Adjusted Gross Income on line 39 of a US IRS form
      1040". There is a whole department of the government which
      defines what it is and which socially owns the term. So while
      the
    </p>
    <p>
      What will change with the Semantic Web's development is that
      its grounding in legacy systems will fade into history. Right
      now, the meaning of "Invoice total value" is effectively
      defined by the software which you plug your RDF document
      into, and how it treats invoices. This is an important way to
      bootstrap the semantic web with useful terms. That will
      become less important as many different software products
      share the same term. In the end, it is a weblike form which
      will characterize the semantic web. Everyone will be defining
      things in terms of other things which they feel are useful
      and stable enough. It will be impossible to insist that there
      be a global ordering between more basic and less basic
      specifications -- and to do so would stop the web scaling. No
      one will agree on a directed <em>acyclic</em> graph
      determining what terms are "more basic" than others. For any
      set of definitions in one direction, there can always be some
      reverse definitions which can be seen by others as just as
      valid.
    </p>
    <p>
      So, while the concept of documents grounded in a given base
      set is important for interoperability, it must not be seen as
      a goal to force the semantic web into an acyclic structure.
      There will be no single Dewey Decimal system for the semantic
      web. The concepts of well-defined stable specifications will
      still be essential. So will respect for the definitions of
      terms. The difference will be that anyone will choose their
      own set of languages they consider "basic", and find ways of
      defining other languages they come across in terms of those.
      A rich web of conversions and translations will grow up to
      support this. The web of trust will provide tools for
      navigating within and selecting from this web in a safe way.
      And of course, global standards will always make life much
      easier where they can be made.
    </p>
    <h3>
      FAQ: Surely meaning is only defined by use?
    </h3>
    <p>
      <em>This is all very well</em>, runs a popular line,
      <em>except that to talk about "meaning" at all is basically
      bogus</em>. <em>The meaning of words, and therefore
      languages, is defined by use - by how people actually respond
      to them, by how they are processed. Surely the only way I can
      guarantee that someone will interpret a document in a
      particular way is to have some out-of-band agreement with
      them first?</em>
    </p>
    <p>
      Philosophically, it is indeed the case that you need some
      out-of-band (not in the message itself) agreement. In real
      life, though, there are a lot of widely-held agreements.
      In fact, the law is a set of agreements which you are deemed
      to accept whether you formally agree or not. So when you are
      sent a tax form, you can't argue that the language of the tax
      form is not one you interpret in that way. They just stick
      you in jail.
    </p>
    <p>
      The web works like one big agreement. By connecting your
      computer to it and getting email from POP and IMAP ports,
      there is an understanding that what you get are MIME
      messages, and the same thing when you pick up a web page using
      HTTP. So by using the web you are entering a world where the
      assumption can be made that messages are to be interpreted by
      a set of specifications. The specifications are (currently)
      generally written in English, and imperfect, but
      debate about them is in practice about details, not about the
      philosophy as to whether they apply. So that is why one can
      in practice talk about meaning.
    </p>
    <h3>
      FAQ: Doesn't the meaning of a document depend on its context?
    </h3>
    <p>
      Of course it does. If I enclose a photocopy of a document as
      an attachment, it doesn't mean I am sending you that letter.
    </p>
    <p>
      However, there are a lot of contexts for a document which
      have the same implication for the meaning of that document.
      Publication, by email to a public list, or HTTP, or FTP, or
      printing on paper and nailing to a tree, in each case leaves
      the meaning of a document defined in the same way. These
      contexts, in which a document is published by a party, or a
      message conveyed from one party to another, are so common
      and basic that the meaning of the document in these contexts
      is referred to simply as the meaning of the document (or
      message).
    </p>
    <p>
      The web architecture separately enumerates the ways in which
      these contexts actually work under the hood (publication using
      HTTP, etc) and the way documents are interpreted and dealt
      with once published. That way, XML languages don't have to
      keep referring to "meaning when received with a 200 code in
      HTTP".
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 06 Jan 1997 00:00:00 GMT</pubDate>
  <title>Metadata Architecture</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Metadata.html</link>
    <guid>https://www.w3.org/DesignIssues/Metadata.html</guid>
      <description><![CDATA[
    <h1>
      Metadata Architecture
    </h1>
    <h4 id="Preface">
      Preface
    </h4>
    <p>
      <em>This document was written before the Semantic Web
      Roadmap, but is an introduction to the same ideas. Both
      introduce the world of machine-readable data on the web. This
      document introduces the concepts in the historical sequence
      at W3C, where the first driving applications of semantic web
      were metadata, and the first driving metadata applications
      were endorsement labels (<a href="#PICS">PICS</a>)</em>.
    </p>
    <h2>
      Documents, Metadata, and Links<br>
    </h2>
    <p>
      The thing which you get when you follow a link, when you
      de-reference a URI, has a lot of names. Formally we call it a
      <b>resource</b>. Sometimes it is referred to as a document
      because many of the things currently on the Web are human
      readable documents. Sometimes it is referred to as an object
      when the object is something which is more machine readable
      in nature or has hidden state. I will use the words document
      and resource interchangeably in what follows and sometimes
      may slip into using "object".
    </p>
    <p>
      One of the characteristics of the World Wide Web is that
      resources, when you retrieve them, do not stand simply by
      themselves without explanation, but there is information
      about the resource. Information about information is
      generally known as <b>Metadata</b>. Specifically, in the web
      design,
    </p>
    <h4>
      Definition
    </h4>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            Metadata is machine understandable information about
            web resources or other things
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      The phrase "machine understandable" is key. &nbsp;We are
      talking here about information which software agents can use
      in order to make life easier for us, ensure we obey our
      principles, the law, check that we can trust what we are
      doing, and make everything work more smoothly and rapidly.
      Metadata has well defined semantics and structure.
    </p>
    <p>
      Metadata was called "Metadata" because it started life, and
      is currently still chiefly, information about web resources,
      so data about data. &nbsp;In the future, when the metadata
      languages and engines are more developed, it should also form
      a strong basis for a web of machine understandable
      information about anything: about the people, things,
      concepts and ideas. &nbsp;We keep this fact in our minds in
      the design, even though the first step is to make a system
      for information about information.
    </p>
    <p>
      For an example of metadata, when an object is retrieved using
      the HTTP protocol, the protocol allows information about its
      date, its expiry date, its owner, and other arbitrary
      information to be sent by the server. The world of the World
      Wide Web is therefore a world of information and some of that
      information is information about information. In order to
      have a coherent picture of this, we need a few axioms about
      metadata. The first axiom is that:
    </p>
    <h4>
      Axiom
    </h4>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            metadata is data.
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      That is to say, information about information is to be
      counted in all respects as information. There are various
      parts of this.
    </p>
    <p>
      One is that metadata can be regarded as data: it can
      be stored in a resource. So, one resource may contain
      information about itself or about another resource. In
      current practice on the World Wide Web there are three ways
      in which one gets metadata. The first is the data about a
      document contained within the document itself, for example in
      the HEAD part of an HTML document or within word processor
      documents. The second is that during the HTTP transfer the
      server transfers some metadata to the client about the object
      which is being transferred. This, during an HTTP GET, is
      transferred from the server to the client and, during a PUT
      or a POST, is transferred from the client to the server. One
      of the things which we have to rationalize in our
      architecture of the World Wide Web is who exactly is making
      the statement: whose statement, whose property, is that
      metadata? The third way in which metadata is found is when it
      is looked up in another document. This practice was not
      very common until the PICS initiative set out to define label
      formats specifically for representing information about World
      Wide Web resources. The PICS architecture specifically allows
      for PICS labels, which are resources about other resources, to
      be buried within the resource itself, to be retrieved as
      separate resources, or to be passed over during the HTTP
      transaction. To conclude,
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            Metadata about one document can occur within the
            document, or within a separate document, or it may be
            transferred accompanying the document.<br>
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      Put another way, metadata can be a first class object.
    </p>
    <p>
      The second part of the above axiom is:
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            Metadata can describe metadata
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      That is, metadata itself may have attributes such as
      ownership and an expiry date, and so there is meta-metadata
      but we don't distinguish many levels, we just say that
      metadata is data and that from that it follows that it can
      have other data about itself. This gives the Web a certain
      consistency.
    </p>
    <h2>
      The Form of Metadata<br>
    </h2>
    <p>
      Metadata consists of assertions about data, and such
      assertions typically, when represented in computer systems,
      take the form of a name or type of assertion and a set of
      parameters, just as in the natural language a sentence takes
      the form of a verb and a subject, an object and various
      clauses.
    </p>
    <h4>
      <a name="independent" id="independent">Axiom</a>
    </h4>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            The architecture is of metadata represented as a set of
            independent assertions.
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      This model implies that in general, two assertions about the
      same resource can stand alone and independently. When they
      are grouped together in one place, the combined assertion is
      simply the sum (actually the logical AND) of the independent
      ones. Therefore (because AND is commutative) collections of
      assertions are essentially unordered sets. This design
      decision rules out for example, in simple sets of data,
      assertions which are somehow cumulative or later ones
      override earlier ones. Each assertion stands independently of
      others.
    </p>
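    <p>
      A minimal Python sketch of this axiom, assuming each assertion
      is modelled as a tuple of (assertion type, subject URI, value);
      the URIs and values are invented for the example.
    </p>

```python
# Two groups holding the same assertions, written in a different order.
label_a = frozenset({
    ("author", "http://example.org/doc", "Alice"),
    ("expires", "http://example.org/doc", "1997-12-31"),
})
label_b = frozenset({
    ("expires", "http://example.org/doc", "1997-12-31"),
    ("author", "http://example.org/doc", "Alice"),
})

# AND is commutative, so the order of the assertions carries no meaning.
print(label_a == label_b)  # True

# Grouping assertions together is a union (logical AND).
# A second "expires" assertion does not override the first: both now
# stand side by side, each as an independent statement.
extra = frozenset({("expires", "http://example.org/doc", "1998-06-30")})
combined = label_a | extra
print(len(combined))  # 3
```

    <p>
      Deciding what to do when two such assertions conflict is left
      to whatever consumes the set; the set itself has no notion of
      "later" or "earlier".
    </p>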
    <p>
      We will see below how logical expressions are formed to
      combine assertions in more varied ways, and syntactic rules
      which allow the subject at least of the assertion to be made
      implicit. But neither of these change the basic operation of
      combining assertions in unordered AND lists.
    </p>
    <h3>
      <a name="Attributes" id="Attributes">Attributes</a>
    </h3>
    <p>
      Assertions about resources are often referred to as
      attributes of the resource. That is, the type of assertion is
      an assertion that the object, the resource in question, has a
      particular named property such as its author, and in that
      case the parameter is the name or identity of the author.
      Similarly, if the attribute is the document's date of expiry
      then the parameter is that date.
    </p>
    <p>
      Often, a group of assertions about the same resource occur
      together, in which case the syntax generally omits the URI of
      that resource as it is implicit. In these cases, when it is
      clear from the context about which resource the assertion is
      being made, the assertion often takes the form of a list of
      attributes and values. In RFC822 format messages, such as
      mail messages and HTTP messages, metadata is transferred
      where the attribute name is an RFC822 header name and the
      rest of the RFC822 line is the value of the attribute, such
      as Date: and From: and To: information. The attribute value
      pair model is that used by most activities defining the
      semantics of metadata today.<br>
    </p>
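    <p>
      As a rough illustration, the Python standard library's
      <code>email</code> parser can read an RFC822-style header block
      and yield exactly such attribute value pairs. The message text
      and the subject identifier <code>mid:example-message</code> are
      made up for the sketch; the point is only that the subject,
      which the syntax leaves implicit, can be made explicit again.
    </p>

```python
from email import message_from_string

# A made-up RFC822-style message: header lines, a blank line, then the body.
raw = (
    "Date: Mon, 06 Jan 1997 00:00:00 GMT\n"
    "From: author@example.org\n"
    "To: reader@example.org\n"
    "\n"
    "Body text.\n"
)

message = message_from_string(raw)

# The headers omit the subject because it is implicit: they are all about
# this message. Here we restore it using a hypothetical identifier.
subject_uri = "mid:example-message"
assertions = [(name, subject_uri, value) for name, value in message.items()]

for assertion in assertions:
    print(assertion)
```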
    <p>
      I use the word "assertion" to emphasize the fact that the
      attribute value pair when it is transferred is a statement
      made by some party. It does not simply and directly imply
      that the resource at any given time has that value for the
      given attribute. It must be seen as a statement by a
      particular party with or without implicit or explicit
      guarantees as to validity. Throughout the World Wide Web, as
      trust becomes an important issue, it will be important for
      software -- and people -- to keep track of and take into
      account who said what in terms of data and metadata. So, our
      model of data of a resource is something about which
      typically we know the creator or the person responsible, and
      typically the date on which the information was created,
      which implies, in the case of a piece of information which
      makes an assertion, the date at which the assertion was made.
    </p>
    <p>
      An assertion
    </p>
    <blockquote>
      (A u1, p, q...)
    </blockquote>
    <p>
      typically has as explicit parameters,
    </p>
    <ul>
      <li>the URI of the resource about which the assertion is made
      (u1).
      </li>
      <li>some identifier (A) for the type of assertion being made,
      such as author or date or expiry date.
      </li>
      <li>other parameters (p, q,...) according to the type of
      assertion.
      </li>
    </ul>
    <p>
      As implicit or explicit parameters,
    </p>
    <ul>
      <li>The party making the assertion
      </li>
      <li>The date/time of the assertion
      </li>
      <li>etc...
      </li>
    </ul>
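    <p>
      One possible way to picture this structure in code, as a sketch
      only: the class name, field names and the
      <code>with_context</code> helper are assumptions made for the
      illustration, not part of any specification.
    </p>

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Assertion:
    kind: str                           # A: the type of assertion
    subject: str                        # u1: URI of the resource described
    params: Tuple[str, ...] = ()        # p, q, ...: depend on the assertion type
    asserted_by: Optional[str] = None   # the party making the assertion
    asserted_at: Optional[str] = None   # the date/time of the assertion


def with_context(assertion, party, when):
    """Fill in parameters left implicit, using the context of the transfer."""
    return replace(
        assertion,
        asserted_by=assertion.asserted_by or party,
        asserted_at=assertion.asserted_at or when,
    )


# The bare assertion says nothing about who made it or when.
bare = Assertion("author", "http://example.org/doc", ("Alice",))

# Hypothetical context: the assertion arrived from this publisher at this time.
stated = with_context(bare, "http://example.org/", "1997-01-06T00:00:00Z")
print(stated)
```

    <p>
      Keeping the party and the date alongside the assertion is what
      lets software later take into account who said what.
    </p>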
    <p>
      We can often make an analogy with programming languages. An
      assertion in metadata can be compared with a function call in
      a programming language. In object oriented languages, the
      object of the function has a special place among the
      parameters just as the subject of an assertion does in
      metadata. In object oriented languages, though, the set of
      possible functions depends on the object, whereas in metadata
      the set of assertion types is more or less unlimited, defined
      by independent choice of vocabulary. <em>Anyone can say
      anything about anything</em>.
    </p>
    <h3>
      A space for attribute names
    </h3>
    <p>
      It is appropriate for the Web architecture to define like
      this the topology and the general concepts of links and
      metadata. What about the significance of individual
      relationships? Sometimes, as above, these are special,
      defined in the architecture, and having an architectural
      significance or a significance to the protocols. In other
      cases, the significance of relationships or indeed of
      attributes is part of other specifications, other design, or
      other applications, and must be defined easily by third
      parties. Therefore, the set of such relationship and
      attribute names must be extremely easily extensible and
      therefore extensible in a decentralized manner. This is why
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            the URL space is an appropriate space for the
            definition of attribute names.
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      We have already (1997) several vocabularies of attribute
      names: for example, the HTML elements which can occur within
      the HEAD element, or as another example, the headers in an
      HTTP request which specify attributes of the object. These
      are defined within the scope of particular specifications.
      There is always pressure to extend these specifications in a
      flexible way. HTTP header names are generally extended
      arbitrarily by those doing experiments. The same can also be
      true of HTML elements and extension mechanisms have been
      proposed for both. If we look generically at the very wide
      space of all such metadata attribute names, we find something
      in which the dictionary would be so large that ad hoc
      arbitrary extension would be just as chaotic as central
      registration would be stifling.
    </p>
    <blockquote>
      <b>Aside: Comparison with Entity-Relationship models</b>.
      <p>
        This architecture, in which the assertion identifier is
        taken from (basically) URL space differs from the
        "Entity-relationship" (ER) model and many similar models
        like it, including most object-oriented programming
        systems. In an ER model, typically every object is typed
        and the type of an object defines the attributes it can have,
        and therefore the assertions which are being made about it.
        Once a person is defined as having a name, address and
        phone number, then the schema has to be altered or a new
        derived type of person must be introduced before one can
        make assertions about the race, color or credit card number
        of a person. The scope of the attribute name is the entity
        type, just as in OOP the scope of a method name is an
        object type (or interface). By contrast, in the web, the
        hypertext link allows statements of new forms to be made
        about any object, even though (before anything other than
        syntax checking) this may lead to nonsense or paradox. One
        can define a property "coolness" within one's own part of
        the web, and then make statements about the "coolness" of
        any object on the web.
      </p>
      <p>
        This design difference is in essence a resurfacing of the
        decision to make links monodirectional, sacrificing
        consistency for scalability.
      </p>
      <p>
        An advantage of ER systems is that they allow one to work,
        in the user interface for example, with a set of properties
        which "should" be defined for each entity. You can define
        these in the Metadata's predicate calculus by defining an
        expression for a "well specified" object. ("For all
        <i>X</i> such that <i>X</i> is a customer <i>X</i> is
        well-specified if there exists <i>n</i> such that <i>n</i>
        is the name of <i>X</i> and there exists <i>t</i> such that
        <i>t</i> is the telephone number of <i>X</i> and...")
      </p>
      <p>
        end of aside.
      </p>
    </blockquote>
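    <p>
      The "coolness" example from the aside can be sketched as
      follows. The property URIs and the helper names
      <code>say</code> and <code>values_of</code> are hypothetical;
      the sketch only shows that a property named by a URL can be
      applied to any object without altering a schema, and that two
      parties' properties do not collide.
    </p>

```python
# Hypothetical property names, each minted in its author's own part of URL space.
MY_COOLNESS = "http://example.org/terms#coolness"
THEIR_COOLNESS = "http://example.net/vocab#coolness"

statements = set()


def say(prop, subject, value):
    """Record an assertion. No type or schema for the subject is consulted."""
    statements.add((prop, subject, value))


def values_of(prop, subject):
    """All values asserted for one property of one subject."""
    return sorted(v for (p, s, v) in statements if p == prop and s == subject)


# Anyone can say anything about anything.
say(MY_COOLNESS, "http://example.com/page", "high")
say(THEIR_COOLNESS, "http://example.com/page", "low")

# The two "coolness" properties share a short name but remain distinct,
# because the full URL is the attribute name.
print(values_of(MY_COOLNESS, "http://example.com/page"))     # ['high']
print(values_of(THEIR_COOLNESS, "http://example.com/page"))  # ['low']
```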
    <h3>
      <a name="MetadataHeaders" id="MetadataHeaders">Metadata
      ("Entity") headers in HTTP</a>
    </h3>
    <p>
      In the above it is important to realize that the HTTP headers
      which contain what can be considered as metadata ("entity
      headers") should be separated quite distinctly from HTTP
      headers which do not. HTTP headers which contain metadata
      contain information which can follow the document around. For
      example, it is reasonable for a cache to pass such
      information on without treatment, it is reasonable for
      clients or other programs which process data to store those
      headers as metadata with the document for later processing.
      The content of those headers does not have to be associated
      with that particular HTTP transaction. By contrast, the
      RFC822 headers in HTTP which deal specifically with the
      transaction or deal specifically with the TCP link between
      the two application programs have a shorter scope and can
      only be regarded as parameters of the HTTP method. To make
      this separation clear will be to make it easier not only to
      understand HTTP and how it should be processed, it will also
      make it clear which pieces of HTTP can be used easily and
      transparently by other protocols which may use different
      methods with different parameters. The clarification of the
      architecture of HTTP such that both the metadata and the
      methods can be extended into other domains is an important
      part of the work of the World Wide Web Consortium. The
      Internet protocols SMTP and NNTP and HTTP as well as many new
      and proposed protocols share much of the semantics of the
      RFC822 headers. Formalizing the shared space and making it
      clear that there is a single design for a particular header,
      rather than four designs which are independent and happen to
      look very similar, requires a general architecture, some
      careful thought, and is essential for the future design of
      protocols. It will allow protocol design to happen in small
      groups which can take for granted the bulk of previous work
      and concentrate on independent new design.
    </p>
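    <p>
      A sketch of the separation in Python. The set of header names
      treated as metadata here is a partial list chosen for
      illustration from the entity-header fields of HTTP/1.1;
      treating everything else as transaction-scoped is a
      simplifying assumption, and <code>split_headers</code> is a
      made-up helper.
    </p>

```python
# Assumption: a partial, illustrative list of headers that describe the
# document itself and so can follow it around.
ENTITY_HEADERS = {
    "content-type",
    "content-language",
    "content-encoding",
    "content-length",
    "expires",
    "last-modified",
}


def split_headers(headers):
    """Separate metadata about the document from parameters of the transaction."""
    metadata = {}
    transaction = {}
    for name, value in headers.items():
        target = metadata if name.lower() in ENTITY_HEADERS else transaction
        target[name] = value
    return metadata, transaction


# Made-up response headers.
response_headers = {
    "Content-Type": "text/html",
    "Last-Modified": "Mon, 06 Jan 1997 00:00:00 GMT",
    "Expires": "Tue, 07 Jan 1997 00:00:00 GMT",
    "Connection": "close",
    "Date": "Mon, 06 Jan 1997 12:00:00 GMT",
}

metadata, transaction = split_headers(response_headers)
print(sorted(metadata))     # ['Content-Type', 'Expires', 'Last-Modified']
print(sorted(transaction))  # ['Connection', 'Date']
```

    <p>
      A cache or client could store <code>metadata</code> with the
      document for later processing and discard
      <code>transaction</code> once the exchange is over.
    </p>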
    <h4>
      Authorship of HTTP entity headers
    </h4>
    <p>
      It may be possible to remove or at least encompass the
      apparent anomaly of metadata transferred from an HTTP server
      by creating a special link type which links the document
      itself to the set of attributes which the server would give
      in the HTTP headers. In other words, the server would be able
      to say, "here is a document, here is some metadata about it,
      and the metadata about it has the following URL". This would
      allow one, for example, to request a signed copy of the HTTP
      headers. It would allow one to ask about the intellectual
      property rights of those headers, and the authorship of those
      headers.
    </p>
    <p>
      It is important to be completely clear about the authorship
      of the HTTP headers. The server should be seen as a software
      agent acting on behalf of a party which is the publisher or
      document author: the definer of the URI to resource identity
      mapping. The webmaster is only an administrator who is
      responsible for ensuring that (through an appropriately
      configured server) the transactions on the wire faithfully
      represent the statements and wishes of that party.
    </p>
    <h2>
      Links<br>
    </h2>
    <p>
      An assertion of relationship between two resources is known
      as a <b>link</b>.
    </p>
    <p>
      In this case, it is a triple
    </p>
    <blockquote>
      (<i>A u1 u2</i>)
    </blockquote>
    <p>
      of:
    </p>
    <ul>
      <li>the type of assertion being made, that is, the
      relationship which is being asserted,
      </li>
      <li>the first URI,
      </li>
      <li>and the second URI.
      </li>
    </ul>
    <p>
      These sorts of assertions, links, are the basis of navigation
      in the World Wide Web; they can be used for building
      structure within the World Wide Web and also for creating a
      semantic Web which can express knowledge about the world
      itself. That is to say, links may be used both for the
      structure of data, in which case they are metadata, but also
      they may be used as a form of data.
    </p>
    <p>
      Links, like all metadata, can be transferred in three ways.
      They can be embedded in a document, which is one end of the
      link, they can be transferred in an HTTP message, for example
      what is called the header of the document, and they can be
      stored in a third document. This latter method has not been
      used widely on the World Wide Web to date.
    </p>
    <h2>
      Goal: <a name="Self-descr" id="Self-descr">Self-describing
      information</a><br>
    </h2>
    <p>
      A critical part of the design of the whole system is the way
      that the semantics of metadata or indeed of data are defined.
      The semantics of metadata in our RFC822 headers in mail
      messages and in HTTP messages are defined by hand in English
      in the specifications of those protocols. The PICS system
      takes this to one stage further in terms of flexibility by
      allowing a message to contain a pointer to the document which
      defines, in human readable terms, the semantics of each
      assertion made within a <a href="#PICS">PICS</a> label. In
      the future we would like to move toward a state in which any
      metadata or eventually any form of machine readable data
      carries a reference to the specification of the semantics of
      all the assertions made within it.
    </p>
    <p>
      For example, suppose that when a link is defined between two
      documents, the relationship which is being asserted is
      defined in such a way that it can be looked up on the World
      Wide Web (i.e. using some form of URI), and someone or some
      program, which has not come across that relationship before
      can follow the link and extend its understanding or
      functionality to take advantage of this new form of
      assertion.
    </p>
    <p>
      In the case of PICS, one can dynamically pick up a human
      readable definition of what that assertion really means. In
      PICS (and in theory in SGML using DTDs), one can also pick up
      a machine readable definition of what form that assertion can
      take, what syntax, what types of parameters it can take. This
      allows a human interface to a new PICS scheme to be built on the
      fly. To go one step further, one could, given a suitable
      logic or knowledge representation language, pick up a machine
      readable definition of the semantics of that assertion in
      terms of other relationships.
    </p>
    <p>
      The advantage of such self-describing information is that it
      allows development of new applications and new functionality
      independently by many groups across the web. Without
      self-describing information, development must wait for large
      companies or standards committees to meet and agree on
      common semantics.
    </p>
    <p>
      Of course a pragmatic way of extending software to handle new
      forms of information is to dynamically download the code to
      support a software object which can handle such data.
      Whereas this is a powerful technique, and one which will be
      used increasingly, it is not sufficient, because one has to
      trust the implementation of the object, and the state.
    </p>
    <h4>
      Goal
    </h4>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            As much as possible of the syntax and semantics should
            be able to be acquired by reference from a metadata
            document.
          </td>
        </tr>
      </tbody>
    </table>
    <h3>
      Building Applications using Link Relationships
    </h3>
    <p>
      It turns out that a very large number of applications both
      built on top of the web and also built within the
      infrastructure of the Web can largely be built by defining
      new relationship types. Examples of these are the document
      versioning problem which can be largely solved by defining
      link values relating documents to previous and future
      versions and to lists of versions; intellectual property
      rights, distribution terms, and other labeling which can be
      solved by making a link from one document to the document
      containing the metadata.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 29 Jan 1998 00:00:00 GMT</pubDate>
  <title>The Web Model: Information hiding and URI syntax</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Model.html</link>
    <guid>https://www.w3.org/DesignIssues/Model.html</guid>
      <description><![CDATA[
    <h1>
      <a name="Model" id="Model">The Web Model</a>
    </h1>
    <p>
      The web is a very general concept -- one universal space of
      information. The concepts it requires such as identifiers and
      information resources (documents) are as general and abstract
      as possible. However, there have been some design decisions
      made which define some interfaces, and effectively define
      modules or agents which are independent. These agents are
      independent in many ways:
    </p>
    <ul>
      <li>There is knowledge they have individually but do not
      share
      </li>
      <li>There is knowledge their designers had individually but
      did not share
      </li>
    </ul>
    <p>
      This is basic modularity. The interfaces are defined by the
      data formats and protocols, and I have ranted about the
      important features to understand about the design in the
      linked articles in this series. This modularity, the ability
      for different parts of the system to evolve independently,
      shows up when different specs are independent, such that you
      could change one without having to change the other.
    </p>
    <h2>
      <a name="Resource" id="Resource">The Information Resource</a>
    </h2>
    <p>
      (Formerly, <a href="#Resource1">Resource</a>)
    </p>
    <p>
      This is the current term for a certain unit of information in
      the Web. In many cases on the current Web, thinking
      "document" will do. It is something which conveys
      information. The Web model is that information in the
      information space is in the abstract chunked into addressable
      things known as resources.
    </p>
    <p>
      In the technical architecture, resources have identifiers,
      Universal Resource Identifiers, and the properties of these
      identifiers are elaborated later. In fact the concept of a
      unit of information is central, not only in the technical
      architecture, but in society's concepts of information, as a
      document is not only the unit for reference, retrieval and
      presentation (typically), but also the unit of ownership,
      license to use, payment, confidentiality, endorsement, etc.
      So though technically we can derive such things as compound
      documents, generic documents, and resources which look
      anything but the typical notion of a "document", we have to
      be able to support these social aspects of information at the
      same time, so we can't mess with it too much.
    </p>
    <h2>
      <a name="Fragement" id="Fragement">Fragment Id and "#"</a>
    </h2>
    <p>
      In the hypertext architecture, when making a reference, such
      as a hypertext link, we don't just refer to an information
      resource. Well, we can, but we can also refer to a particular
      part of or view of a resource. The string which, within the
      document, defines the other end of the link has two parts. It
      has the identifier of the document as a whole, and then
      optionally it has a hash sign "#" and a string representing
      the view of the object required. &nbsp;This suffix is called
      a fragment identifier. &nbsp;(Even though it doesn't
      necessarily represent a fragment of the document: it could
      represent how the document should be viewed.) The fragment
      identifier only has relevance in the context of the web page
      in question. This has an implication for how the software is
      built. For example, an "access" module can be given just the
      bit of the URI without the fragment identifier. It gets the
      information, and creates a software object for the hypertext
      page. That object is passed the fragment identifier.
    </p>
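    <p>
      As a minimal sketch of that split, assuming Python's standard
      <code>urllib.parse</code> module and a made-up example URI, the
      reference can be cut at the "#" so that the access module sees
      only the first part and the object sees only the second. The
      function name <code>split_reference</code> is hypothetical.
    </p>

```python
from urllib.parse import urldefrag

def split_reference(reference):
    """Split a reference into (URI for the access module, fragment id for the object)."""
    uri, fragment_id = urldefrag(reference)
    return uri, fragment_id

# The example URI is invented for illustration.
access_uri, fragment_id = split_reference("http://example.org/doc.html#section2")
print(access_uri)   # http://example.org/doc.html
print(fragment_id)  # section2
```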
    <p style="text-align: center;">
      <img src="ParseHash.png" style="height: 20em; max-width: 95%;" alt="The URI is split off at the hash into a fragment ID and the rest" border="0">
    </p>
    <p>
      In fact, analyzing the system a little more, the access
      function can be broken into the underlying access which
      creates the object by passing two things to some kind of
      object creator ("factory"): a data stream and a MIME type.
    </p>
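    <p>
      One way to picture the "factory" is a table from MIME type to
      object class. The sketch below is only an illustration: the
      class names, the <code>view</code> method and the meaning given
      to the fragment identifier in each class are assumptions, not
      part of any specification.
    </p>

```python
import io

class HypertextPage:
    """Presentation object for text/html; it interprets fragment ids as anchors."""
    def __init__(self, stream):
        self.text = stream.read()

    def view(self, fragment_id):
        return f"hypertext page positioned at anchor {fragment_id!r}"

class PlainText:
    """Presentation object for text/plain (what a fragment means here is hypothetical)."""
    def __init__(self, stream):
        self.lines = stream.read().splitlines()

    def view(self, fragment_id):
        return f"plain text, {len(self.lines)} lines, view {fragment_id!r}"

# The factory only needs the two things the access layer delivers:
# a data stream and a MIME type. It never sees the URI.
FACTORY = {"text/html": HypertextPage, "text/plain": PlainText}

def create_object(stream, mime_type):
    if mime_type not in FACTORY:
        raise ValueError(f"no object class registered for {mime_type}")
    return FACTORY[mime_type](stream)

page = create_object(io.StringIO("Some hypertext"), "text/html")
print(page.view("section2"))  # hypertext page positioned at anchor 'section2'
```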
    <h3>
      Generally
    </h3>
    <p>
      Hypertext is a specific application, but this principle works
      for other applications on the Web. In fact, when we discuss
      <a href="Webize">webizing</a> an application, we take some
      computer language, and we take what were document-global
      things, say global variables in a programming language, and
      make them truly global by appending the URI of the document
      and "#".
    </p>
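    <p>
      A tiny sketch of that step, with an invented document URI and
      invented variable names: each name that was global only within
      the document becomes a URI by prefixing the document's URI and
      "#". The helper name <code>webize</code> is hypothetical.
    </p>

```python
def webize(document_uri, local_names):
    """Map each document-global name to a web-global URI."""
    return {name: f"{document_uri}#{name}" for name in local_names}

# Both the URI and the names are assumptions made for this example.
names = webize("http://example.org/program", ["counter", "limit"])
print(names["counter"])  # http://example.org/program#counter
```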
    <p>
      Clearly, in different applications the fragment identifier
      will have a completely different function. The independence
      here means that new applications (such as the Semantic Web)
      can be built, just like the hypertext web, just by introducing
      new types of document.
    </p>
    <h2>
      Independence
    </h2>
    <p>
      The model of how the web works is that there are two separate
      functions. &nbsp;The part (blue in the picture) which
      accesses the document deals with its identifier, but does not
      know what view will be required. &nbsp;It creates some
      software object which represents and presents the resource.
      That object does not need to know how it was created
      (necessarily), and so does not need to know the URI it was
      identified by. However, it does know how to interpret the
      Fragment ID.
    </p>
    <p>
      So we have two axioms:
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            The access machinery does not need to look at the
            fragment ID.
          </td>
          <td></td>
        </tr>
      </tbody>
    </table>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            The presentation object does not need to know the URI
            of the resource
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      The equivalent axioms&nbsp;when we are talking about
      specifications amount to:
    </p>
    <table border="1" cellspacing="5" cellpadding="5">
      <tbody>
        <tr>
          <td>
            The specifications for access protocols are independent
            of the specifications for fragment identifiers.
          </td>
        </tr>
      </tbody>
    </table>
    <h3>
      Why?
    </h3>
    <p>
      For one thing, consider the special case of a link within a
      document. &nbsp;In this case, the link <b>only</b> specifies
      a fragment identifier. &nbsp;The object can follow the link
      itself. &nbsp;It doesn't have to consult the access code in
      order to figure out &nbsp;where the link goes to.
      &nbsp;Because the "#" syntax is universal to all access
      methods, the object can process the link internally.
      &nbsp;For a static HTML file, for example, this means that
      you can write an HTML file with internal links without
      worrying or knowing about exactly what URIs the file will
      get. &nbsp;It means you don't have to alter the file if you
      choose to serve it in some new name or address space. &nbsp;If
      the "#" syntax were not a universal specification for the web,
      this would break: you couldn't do it. As Jim Gettys points
      out, as the era of digitally signed documents comes upon us,
      changing a signed document will break the signature on it. So
      allowing one to make a self-consistent document with internal
      links in a way independent of the namespace is even more
      essential.
    </p>
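    <p>
      The internal-link case can be sketched as follows. This is an
      illustration under assumptions, not a description of any real
      browser: the <code>Page</code> class, its anchor table and the
      return values are invented. The point is only that the object
      handles a "#" link by itself and never needs its own URI.
    </p>

```python
class Page:
    """A presentation object that knows its anchors but not its own URI."""
    def __init__(self, anchors):
        self.anchors = anchors  # fragment id -> position in the page

    def follow(self, link):
        if link.startswith("#"):
            # Internal link: resolved here, with no access code involved.
            return ("internal", self.anchors[link[1:]])
        # Anything else is handed back to the application.
        return ("hand to application", link)

page = Page({"summary": 120, "details": 480})
print(page.follow("#summary"))      # ('internal', 120)
print(page.follow("chapter2.html")) # ('hand to application', 'chapter2.html')
```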
    <h3>
      Why else?
    </h3>
    <p>
      This independence is very important for the evolution of the
      Web. &nbsp;It means that people can go off and design all
      kinds of new systems for naming, addressing and accessing
      documents, without having to worry about what sort of
      documents will be moved. &nbsp;It means that people can go
      off and make new media types (MIME types), each of which can
      have different concepts for views and fragments, without
      having to talk to the people developing the access
      technology. This has already (1998) proved incredibly
      enabling to the community, as HTTP has advanced in parallel
      with many other ways of accessing data, and the number of
      exciting media types has grown very rapidly, and will be the
      key to many new revolutions built on top of the basic Web
      idea.&nbsp;
    </p>
    <p>
      If you look at the diagram you will notice how the fragment
      IDs are generated by and understood by just the one module.
      &nbsp;You see how, when designing a new MIME type, one is
      quite free to be creative in making new and powerful forms of
      fragment ID, knowing that no other specifications will refer
      to them, and nothing else will break.
    </p>
    <h2>
      Document sets and relative addressing
    </h2>
    <p>
      Now let us look at what happens when we follow a link.
      &nbsp;For example, say a hypertext page is clicked on.
      &nbsp;The page has a representation of the end point of the
      link. &nbsp;It hands it to the application. &nbsp;In fact,
      often, there are links between pages whose URIs are very
      similar and only differ in the right hand part. &nbsp;This
      isn't true of all name spaces: for example, when making links
      between news articles identified by the news ID (news:foo), a
      unique ID, you have to specify the whole thing. However, if
      you restrict publication of a set of documents to a
      hierarchical name or address space, then you can arrange for
      documents which are very related and have many links to be in
      the same part of the tree.
    </p>
    <p>
      In this case, the links between these documents are "relative
      URIs".
    </p>
    <p>
      What happens then is that the relative URI, which only has
      the locally different part of the URI in it, is handed back
      to what in the diagram I have called the "application", to be
      turned into an absolute URI by being combined with the
      absolute URI of the resource, which the application has
      remembered.
    </p>
    <p>
      Note that the application is aware of the absolute URI but
      the resource still does not have to be.
    </p>
    <p>
      Note that the fragment id is still circulated around a loop
      between the object (green) which understands it and the
      application (yellow) which handles it transparently but does
      not understand or change it.
    </p>
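    <p>
      The loop described above might be sketched like this, assuming
      Python's standard <code>urllib.parse</code> functions and
      invented URIs. The <code>Application</code> class is
      hypothetical: it remembers the absolute URI, resolves the
      relative URI against it, and carries the fragment id along
      without interpreting it.
    </p>

```python
from urllib.parse import urljoin, urldefrag

class Application:
    """Remembers the absolute URI of the current resource."""
    def __init__(self, base_uri):
        self.base_uri = base_uri

    def follow(self, link):
        # Combine the relative URI with the remembered absolute URI.
        absolute = urljoin(self.base_uri, link)
        # Split off the fragment id: it is passed on untouched to the
        # object created for the new resource, never to the access code.
        access_uri, fragment_id = urldefrag(absolute)
        return access_uri, fragment_id

app = Application("http://example.org/book/chapter1.html")
print(app.follow("chapter2.html#intro"))
# ('http://example.org/book/chapter2.html', 'intro')
```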
    <p>
      Now there was a design decision that the application could
      have passed to the access module both the relative URI and
      the absolute URI. Then, different namespaces would have been
      able to have different algorithms for resolving a base URI
      and a relative URI into a new absolute URI. But the decision
      was made that the relative address format should be common
      across all name spaces.
    </p>
    <p style="text-align: center;">
      <img src="Parse2.png" style="height: 20em; max-width: 95%;" alt="The URI is split off at the hash into a fragment ID and the rest" border="0">
    </p>
    <h3>
      Why?
    </h3>
    <p>
      Just as we considered internal links above, now consider
      relative links between a bunch of documents, like the
      sections of a book, which are close in the tree. &nbsp;In
      practice, such document sets are moved from place to place,
      from file systems into HTTP space or FTP space, and because
      the relative address rules are universal, the documents do
      not have to be modified every time they are moved. (Yes, if
      you move half the set to one place and half to another, you
      have to fix links). &nbsp;This is happening all the time.
      &nbsp;People are creating and programs are generating
      hypertext with relative links without knowing or caring what
      absolute URI will be used to refer to the material.
    </p>
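    <p>
      To illustrate, here is a small sketch using Python's standard
      <code>urljoin</code>. The host names and paths are invented.
      The same two relative links resolve by the same rule whether
      the document set is served in HTTP space or FTP space, so the
      documents themselves need no change.
    </p>

```python
from urllib.parse import urljoin

# Relative links as they might appear inside one chapter of a book.
RELATIVE_LINKS = ["chapter2.html", "../index.html"]

def resolve_links(base_uri, links=RELATIVE_LINKS):
    """Resolve every relative link against the document's current absolute URI."""
    return [urljoin(base_uri, link) for link in links]

# The same document set published in two different spaces.
print(resolve_links("http://example.org/book/chapters/chapter1.html"))
print(resolve_links("ftp://ftp.example.org/pub/book/chapters/chapter1.html"))
```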
    <h2>
      The access scheme
    </h2>
    <p style="text-align: center;">
      <img src="Parse3.png" style="height: 20em; max-width: 95%;" alt="The URI is split off at the hash into a fragment ID and the rest" border="0">
    </p>
    <p>
      The so-called "access scheme" is the first part of the URI.
      As we have seen above, you don't have to know anything about
      it to parse relative URIs or to process the fragment
      identifier of a URI. The knowledge of particular schemes is
      limited to the "access" function (blue in the above diagram).
    </p>
    <p>
      The scheme is a very important flexibility point, and should
      not be abused. Anyone dereferencing a URI must have a
      knowledge of the scheme it uses.
    </p>
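    <p>
      A rough sketch of an "access" function that keeps all scheme
      knowledge in one place is shown below. The handler functions
      are stand-ins that return a description instead of doing any
      network access, and the scheme name <code>newspace</code> is
      invented. An unknown scheme cannot be dereferenced at all.
    </p>

```python
from urllib.parse import urlsplit

def fetch_http(uri):
    return f"would fetch {uri} using HTTP"

def fetch_ftp(uri):
    return f"would fetch {uri} using FTP"

# The only place in the program that knows about particular schemes.
ACCESS_HANDLERS = {"http": fetch_http, "ftp": fetch_ftp}

def dereference(uri):
    scheme = urlsplit(uri).scheme
    handler = ACCESS_HANDLERS.get(scheme)
    if handler is None:
        raise LookupError(f"no access handler for scheme {scheme!r}")
    return handler(uri)

print(dereference("http://example.org/doc.html"))
```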
    <p>
      The access scheme defines a huge part of URI space. The
      scheme defines a subspace with particular properties.
    </p>
    <p>
      The access scheme is <i>by definition</i> the highest point
      of flexibility. What does that mean? It means that if the
      whole Web develops problems which we cannot solve within the
      existing protocols, or if new spaces are designed which
      really can't be accessed through or mapped into existing
      spaces, then we can create a new space. We have faith that we
      will be able to use this flexibility point in the future,
      because it worked successfully for integrating the older
      spaces such as Gopher and FTP spaces into the Web.
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            If you have ported a concept between environments in
            the past, then there is a better hope that you can in
            the future.
          </td>
        </tr>
      </tbody>
    </table>
    <h3>
      The danger of too many access schemes
    </h3>
    <p>
      However, we do not do this lightly. When we introduce a new
      space, it may have very different properties and we expect
      that the deployment of new software will be needed to allow
      access to it. Some spaces may be gatewayable into HTTP space,
      and this will often provide a transition path. This is why
      early browsers allowed one to declare in a configuration file
      what gateways to use for what new spaces.
    </p>
    <p>
      If we use this extension point frivolously, ironically, it
      will cease to work. Suppose very many schemes are introduced.
      The access scheme space itself becomes a namespace with all
      the problems which current namespaces such as DNS are trying
      to solve, but which are very hard problems:
    </p>
    <ul>
      <li>Clashes in the namespace would destroy interoperability;
      </li>
      <li>Ownership of the space becomes commercially valuable;
      </li>
      <li>Democratic and fair management becomes essential and
      difficult;
      </li>
    </ul>
    <p>
      Worse, though, technology will be needed to automatically
      dereference the schemes themselves and download code to
      handle them. Something like DNS will be needed. The top level
      namespace then becomes in fact DNS, or something like it.
      This, however, begs the question. What happens if later DNS
      needs to be replaced? There is no top-level extension switch
      left. The world is stuck with whatever form of access-scheme
      name service exists.
    </p>
    <p>
      Therefore, I conclude that access schemes should not be open
      to trivial extension, and that the access scheme should only
      be extended by the introduction of new standards with full
      open review by the entire community.
    </p>
    <h3>
      Alternatives to new schemes
    </h3>
    <p>
      Whereas some schemes (like "data:") are clearly neat and new
      and orthogonal to HTTP, many schemes could in fact be
      integrated into HTTP, using HTTP extension mechanisms.
    </p>
    <p>
      In fact, if HTTP is to be taken as a general computing
      protocol, then use of an <a href="Extensible.html">extensible
      language system</a> for the HTTP request message would allow
      a huge amount of extension, covering protocols with different
      functionality (exporting different interfaces).
    </p>
    <h3>
      Evolving scheme spaces
    </h3>
    <p>
      When considering the evolution of a space, it is important to
      remember that primarily the access scheme refers to a part of
      the URI space, and secondarily it refers to a protocol.
      Therefore, one can in fact change the protocols used to
      access resources within a scheme's namespace, without
      changing the space. For example, a new DNS protocol could be
      introduced which over time would replace the current one,
      without changing the DNS space. This would effectively
      redefine the HTTP and FTP protocols, but would not harm the
      namespaces. When touch-tone dialing was introduced, the
      telephone numbering system remained the same. So an indexing
      system could be introduced which, when deployed, would allow
      http:// space objects to be found with greater reliability or
      speed than the current protocols, while maintaining the HTTP
      space as being the concatenation of a DNS name and an opaque
      string.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 01 Oct 2007 00:00:00 GMT</pubDate>
  <title>Modularity</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Modularity.html</link>
    <guid>https://www.w3.org/DesignIssues/Modularity.html</guid>
      <description><![CDATA[
  <h1>Modularity</h1>
  <h2>Simple things make firm foundations</h2>
  <p>You can look at the development of web technology in many
  ways, but one way is as a major software project. In software
  projects, the <a href="/TR/webarch/#orthogonal-specs">independence of specs</a> has
  always been really important, <a href="/DesignIssues/Principles#Modular">I have felt</a>. A classic
  example is the independence of the HTTP and HTML specifications:
  you can introduce many forms of new markup language to the web
  through the MIME Content-Type system, without changing HTTP at
  all.</p>
  <p>The modularity of HTML itself has been discussed recently, for
  example by Ian Hickson, co-Editor of <a href="/html/wg/html5/">HTML5</a>:</p>
  <blockquote>
    <p>Note that it really isn't that easy. For example, the HTML
    parsing rules are deeply integrated with the handling of
    &lt;script&gt; elements, due to document.write(), and also are
    deeply integrated with the definition of innerHTML. Scripting,
    in turn, is deeply related to the concept of scripting
    contexts, which depends directly on the definition of the
    Window object and browsing contexts, which, in turn, are deeply
    linked with session history and the History object (which
    depends on the Location object) and with arbitrary browsing
    context navigation (which is related to hyperlinks and image
    maps) and its related algorithms (namely content sniffing and
    encoding detection, which, to complete the circle, is part of
    the HTML parsing algorithm). - <a href="http://lists.w3.org/Archives/Public/public-html/2007JanMar/0096.html">
    <em>Brainstorming test cases, issues and goals, etc.</em>,</a>
    Ian Hickson</p>
  </blockquote>
  <p>and in reply by Laurens Holst:</p>
  <blockquote>
    <p>I don't know the spec well enough to answer that question,
    but I'd say modularization (if I may call it so) would make it
    both easier to grasp as individual chunks, for both the
    reviewing process and the implementing process. <a href="http://lists.w3.org/Archives/Public/public-html/2007JanMar/0100.html">
    <em>brainstorming: test cases, issues, goals, etc.</em></a>. -
    Laurens Holst</p>
  </blockquote>
  <p>The &lt;canvas&gt; element introduces a complex <a href="/html/wg/html5/#the-2d">2D drawing API</a> different in nature
  from the other interfaces, which concentrate on setting and
  retrieving values in the markup itself; the <a href="/html/wg/html5/#sql">client-side database storage</a> section of
  the specification is another such interface. While the
  &lt;canvas&gt; element has a place in the specification, the
  drawing API should be defined in a separate document. Hixie
  <a href="/2002/09/wbs/40318/tactics-gapi-canvas/results#xq3">expressed</a>
  a similar sentiment (and see the group's <a href="/html/wg/tracker/products/2">issues about scope</a>):</p>
  <blockquote>
    <p>The actual 2D graphics context APIs probably should be split
    out on the long term, like many other parts of the spec. On the
    short term, if anyone actually is willing to edit this as a
    separate spec, there are much higher priority items that need
    splitting out and editing...</p>
  </blockquote>
  <p>It would also be nice if the &lt;canvas&gt; element and the
  SVG elements which embed in HTML did so in just the same way, in
  terms of the context (style, etc.) which is passed (or not
  passed) across the interface, in terms of the things an
  implementer has to learn about, and things which users have to
  learn about. So that &lt;canvas&gt; and SVG can be perhaps
  extended to include say 3D virtual reality later, and so that all
  of these can be plugged into other languages just as they are
  plugged into HTML.</p>
  <p>There are lots of reasons for modularity. The basic one is
  that one module can evolve or be replaced without affecting the
  others. If the interfaces are clean, and there are no side
  effects, then a developer can redesign a module without having to
  deeply understand the neighboring modules.</p>
  <p>The flip side is that a cleanly designed module designed as
  part of one system can be re-used in other systems.</p>
  <p>It is the independence of the technology which is important.
  This doesn't, of course, have to directly align with the
  boundaries of documents, but equally obviously it makes sense to
  have the different technologies in different documents so that
  they can be reviewed, edited, and implemented by different
  people.</p>
  <p>The web architecture should not be seen as a finished product,
  nor as the final application. We must design for new applications
  to be built on top of it. There will be more modules to come,
  which we cannot imagine now. The Internet transport layer folks
  might regard the Web as an application of the Net, as it is, but
  always the Web design should be to make a continuing series of
  platforms each based on the last. This works well when each layer
  provides a simple interface to the next. The IP is simple, and so
  TCP can be powerfully built on top of it. The TCP layer has a
  simple byte stream interface, and so powerful protocols like
  HTTP can be built on top of it. The HTTP layer provides,
  basically, a simple mapping of URIs to representations: data and
  the metadata you need to interpret it. That mapping, which is the
  core of Web architecture, provides a simple interface on top of
  which a variety of systems -- hypertext, data, scripting and so
  on -- can be built.</p>
  <p>So we should always be looking to make a clean system with an
  interface ready to be used by a system which hasn't yet been
  invented. We should expect there to be many developers to come
  who will want to use the platform without looking under the hood.
  Clean interfaces give you invariants, which developers use as
  foundations of the next layer. Messy interfaces introduce
  complexity which we may later regret.</p>
  <p>Let us try, as we make new technology, or plan a path for old
  technology, always to keep things as clean as we can.</p>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 01 Mar 2002 00:00:00 GMT</pubDate>
  <title>Design alternatives considered in Notation3</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/N3Alternatives.html</link>
    <guid>https://www.w3.org/DesignIssues/N3Alternatives.html</guid>
      <description><![CDATA[
    <h1>
      Alternative design choices in <a href="Notation3.html">Notation3</a>
    </h1>
    <p>
      In this article:
    </p>
    <ol>
      <li>
        <a href="#Syntax">Syntax for Graph traversal</a> ("paths")
      </li>
      <li>
        <a href="#Infix">Infix operators</a>
      </li>
      <li>
        <a href="#Sets">Syntax for sets</a>
      </li>
      <li>
        <a href="#Considered">Other issues</a>
      </li>
    </ol>
    <h2>
      <a name="Syntax" id="Syntax">Syntax for graph traversal</a>
    </h2>
    <p>
      There is a strong need for a neat syntax for converting an
      expression for x into an expression for something removed one
      step along the graph from x by an arc of type (rdf:Property)
      p. For example, if x is a person then we want an expression
      for x's email address. <em>(I am dropping the prefixes in
      this discussion to reduce clutter)</em>
    </p>
    <p>
      "Neat"? Compact, powerful, simple, naturally understandable
      because of metaphors with existing use of similar syntax.
    </p>
    <p>
      Strictly, we are talking about <em>some y, such that
      p(x,y)</em>, or in n3, [is p of x]. There is no implication
      in this syntax at the moment (but could be later) that there
      is only one such y. The information that there can be only
      one such y, when it is so, is conventionally stored by
      noting that p is a daml:uniqueProperty property. This can be
      stated in any document, though current colloquial use puts it
      into the schema for p.
    </p>
    <p>
      I will call moving from x to [ is p of x] forward traversal,
      and moving from x to [p x] backward traversal. My instinct is
      that forward traversal, which is the only thing you can do
      naturally in many systems of linked objects, is a more common
      need in the language than backward traversal.
    </p>
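    <p>
      To make the two directions concrete, here is a small sketch
      over a set of (subject, property, object) triples. The data and
      the function names are invented for illustration.
      <code>forward</code> corresponds to [is p of x] and
      <code>backward</code> to [p x], and each returns a set because
      nothing implies there is only one such y.
    </p>

```python
# A toy graph: each triple (s, p, o) asserts p(s, o).
TRIPLES = {
    ("x", "boss", "alice"),
    ("bob", "boss", "alice"),
    ("alice", "email", "alice@example.org"),
}

def forward(x, p, triples=TRIPLES):
    """All y such that p(x, y), i.e. the N3 expression [is p of x]."""
    return {o for (s, pred, o) in triples if s == x and pred == p}

def backward(x, p, triples=TRIPLES):
    """All y such that p(y, x), i.e. the N3 expression [p x]."""
    return {s for (s, pred, o) in triples if o == x and pred == p}

print(forward("x", "boss"))               # {'alice'}
print(sorted(backward("alice", "boss")))  # ['bob', 'x']
```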
    <p>
      Backward traversal can also be expressed as forward traversal
      through the inverse of a property, so a compact expression
      for the inverse of a property would be an alternative, so
      long as it was clear when this was a syntactic device for
      making a backward link, and when (if ever) it was actually
      used to make
    </p>
    <p>
      We need to choose both the punctuation and the grammar,
      including the precedence of the operator if any, to be able
      to write "the person whose wife's uncle is driving my
      brother's car". Mostly here I am looking at traversal
      expressions going left to right with no precedence, but "of"
      as used in English is an exception in that it is right to left.
    </p>
    <h3>
      Use case examples
    </h3>
    <p>
      Forward traversal: The phone number of the home of the chair
      of the conference x.
    </p>
    <table border="1">
      <caption>
        Example scenarios
      </caption>
      <tbody>
        <tr>
          <td></td>
          <th>
            English
          </th>
          <th>
            Existing Notation3 (2002/02)
          </th>
        </tr>
        <tr>
          <td>
            <p>
              Forward traversal
            </p>
          </td>
          <td>
            The phone number of the home of the boss of x. X's
            boss' home's phone number.
          </td>
          <td>
            [ is :phone of [is :home of [is :boss of :x]]]
          </td>
        </tr>
        <tr>
          <td>
            Mixed traversal
          </td>
          <td>
            The phone number of the home of someone whose boss is
            the uncle of x.
          </td>
          <td>
            [is :phone of [is home of [ boss [is uncle of :x]]]]
          </td>
        </tr>
        <tr>
          <td>
            Units
          </td>
          <td>
            100 dollars.
          </td>
          <td>
            [dollars "100"]
          </td>
        </tr>
        <tr>
          <td>
            Units
          </td>
          <td>
            the price in dollars
          </td>
          <td>
            [ is dollars of price]
          </td>
        </tr>
        <tr>
          <td>
            Language
          </td>
          <td>
            The French phrase "chat"
          </td>
          <td>
            [ lang:fr "chat"]
          </td>
        </tr>
        <tr>
          <td>
            Language
          </td>
          <td>
            The title in French
          </td>
          <td>
            [ is lang:fr of :label]
          </td>
        </tr>
        <tr>
          <td>
            Mixed
          </td>
          <td>
            The author of the book whose title in English is "The
            Little Prince"
          </td>
          <td>
            [is author of [ has title [lang:en "The Little
            Prince"]]]
          </td>
        </tr>
        <tr>
          <td>
            Unary function
          </td>
          <td>
            The sine of x.
          </td>
          <td>
            [is sine of x]
          </td>
        </tr>
        <tr>
          <td>
            Nary function
          </td>
          <td>
            The maximum of 12, 23 and 20
          </td>
          <td>
            [is math:max of ("12" "23" "20")]
          </td>
        </tr>
        <tr class="not">
          <td>
            Nary function (named args) <strong>Not</strong> a
            traversal case
          </td>
          <td>
            The result of spellchecking foo.html with
            dictionary eng.dict.
          </td>
          <td>
            [we:spellcheck &lt;foo.html&gt;; we:dictionary
            &lt;eng.dict&gt;.]
          </td>
        </tr>
        <tr>
          <td>
            Labeled traversal
          </td>
          <td>
            a sculpture with a price of x dollars and creator y
            domiciled in italy.
          </td>
          <td>
            [a Sculpture; cost [ dollars x]; creator [=y; domicile
            cc:it]] <em>This use of "=" is not real N3 syntax</em>
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      The last case, labelled traversal, is in fact much more than
      graph traversal: by embedding variables into the graph in a
      search template (rule antecedent), one makes a reference which
      can be used in a rule conclusion. One can also, by reusing a
      variable more than once, make a multiply connected (right
      phrase?) graph in place of a tree.
    </p>
    <h3>
      Dot
    </h3>
    <p>
      This problem has a strong analogy with moving from an object to
      a slot in an object, as in Python, C++, etc.: x.email. So that
      metaphor is a natural one to pick up.
    </p>
    <p>
      Pro: For programmers, this is a natural.
    </p>
    <p>
      Con: Dot as the end of an N3 sentence would have to be
      protected by following space or punctuation. The language is
      made more complex in that some tricky tokenizing with
      some form of look-ahead becomes necessary.
    </p>
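    <p>
      As a rough illustration of the look-ahead involved, here is a
      minimal Python sketch. It assumes one possible rule (not part of
      N3 as specified): a dot followed by whitespace or end of input
      ends a statement, and any other dot joins two path steps. The
      function name <code>tokenize</code> is hypothetical, and string
      literals, numbers and URIs containing dots are ignored.
    </p>

```python
import re

# Assumed rule: "." followed by whitespace or end of input ends a statement;
# any other "." is a path separator. Literals and URIs are not handled.
TOKEN = re.compile(r"\.|[^\s.]+")

def tokenize(text):
    """Return (kind, value) pairs; kind is 'name', 'path' or 'end'."""
    tokens = []
    for match in TOKEN.finditer(text):
        value = match.group()
        if value != ".":
            tokens.append(("name", value))
            continue
        # One character of look-ahead decides what this dot means.
        following = text[match.end():match.end() + 1]
        if following == "" or following.isspace():
            tokens.append(("end", value))
        else:
            tokens.append(("path", value))
    return tokens
```

    <p>
      Under this rule <code>x.boss.email.</code> yields two path dots
      and one statement-ending dot; the cost is that the meaning of a
      dot now depends on the character after it.
    </p>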
    <p>
      There is no equivalent convention as far as I know for
      backward traversal, so let's pick something random and
      inverse to "." -- say "^". (Metaphor: back up rather than
      down forwards?). Think of "^" as a combination of "." and an
      operator to generate the inverse property. (Or maybe "^"
      should be that property, which would make foo.^bar a
      back-traversal except that it would actually be represented
      using an extra triple.)
    </p>
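    <p>
      To make the intended reading of "." and "^" concrete, here is a
      small Python sketch that evaluates a left-to-right path over a
      set of triples. The toy graph, the names <code>step</code> and
      <code>evaluate</code>, and the choice to return a set of nodes are
      all assumptions made for illustration; this is not how any N3
      processor is claimed to work.
    </p>

```python
import re

# Hypothetical toy graph as (subject, property, object) triples.
TRIPLES = {
    ("x", "uncle", "bob"),
    ("alice", "boss", "bob"),
    ("alice", "home", "house1"),
    ("house1", "email", "mailto:home@example.org"),
}

def step(nodes, direction, prop, triples):
    """Follow prop forward ('.') or backward ('^') from every node in nodes."""
    if direction == ".":
        return {o for (s, p, o) in triples if p == prop and s in nodes}
    return {s for (s, p, o) in triples if p == prop and o in nodes}

def evaluate(path, triples):
    """Evaluate a left-to-right path such as 'x.uncle^boss.home.email'."""
    # Splitting with a capturing group keeps the operators in the result.
    parts = re.split(r"([.^])", path)
    nodes = {parts[0]}
    for direction, prop in zip(parts[1::2], parts[2::2]):
        nodes = step(nodes, direction, prop, triples)
    return nodes
```

    <p>
      With this graph, <code>evaluate("x.uncle^boss.home.email",
      TRIPLES)</code> walks forward to x's uncle, backward to whoever
      has that person as boss, then forward to the home and its email.
      A set is returned because a property need not be a function in
      either direction.
    </p>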
    <h3>
      Bang
    </h3>
    <p>
      There is a form of path familiar to those who knew email and
      net news in the days of source routing: when one had to
      specify a series of machine names through which the mail had
      to be forwarded, as in
      <code>mcvax!cernvax!online!timbl</code>. Though few current
      users will remember it, it has the advantage over dot of
      being unused elsewhere in the N3 syntax. This leaves the N3
      language simpler.
    </p>
    <table border="1">
      <caption>
        Example scenarios
      </caption>
      <tbody>
        <tr>
          <td></td>
          <th>
            English
          </th>
          <th>
            Using dot and caret, left to right
          </th>
          <th>
            Right to left parsing with $ and %
          </th>
          <th>
            Keywords, right to left
          </th>
        </tr>
        <tr>
          <td>
            Forward traversal
          </td>
          <td>
            The phone number of the home of the boss of x. X's
            boss' home's phone number.
          </td>
          <td>
            x.boss.home.email
          </td>
          <td></td>
          <td>
            email of home of boss of x
          </td>
        </tr>
        <tr>
          <td>
            Mixed traversal
          </td>
          <td>
            The phone number of the home of someone whose boss is
            the uncle of x.
          </td>
          <td>
            x.uncle^boss.home.email
          </td>
          <td></td>
          <td>
            email of home of thatwhich boss uncle of x
          </td>
        </tr>
        <tr>
          <td></td>
          <td>
            The formula from parsing a document whose URI is the
            first command line argument
          </td>
          <td>
            "1".os:argv^log:uri.log:semantics
          </td>
          <td>
            log:semantics%log:uri$os:argv%"1"
          </td>
          <td>
            log:semantics of [] which has uri [] which is os:argv
            of "1"
          </td>
        </tr>
        <tr>
          <td>
            Units (b)
          </td>
          <td>
            100 dollars.
          </td>
          <td>
            "100"^dollars
          </td>
          <td>
            dollars$"100"
          </td>
          <td>
            thatwhich dollars 100
          </td>
        </tr>
        <tr>
          <td>
            Units (f)
          </td>
          <td>
            the price in dollars
          </td>
          <td>
            price.dollars
          </td>
          <td>
            dollars%price
          </td>
          <td>
            dollars of price
          </td>
        </tr>
        <tr>
          <td>
            Language (b)
          </td>
          <td>
            The french phrase "chat"
          </td>
          <td>
            "chat"^lang:fr
          </td>
          <td>
            lang:fr$"chat"
          </td>
          <td>
            thatwhich lang:fr "chat"
          </td>
        </tr>
        <tr>
          <td>
            Language (f)
          </td>
          <td>
            The title in french
          </td>
          <td>
            title.lang:fr
          </td>
          <td>
            lang:fr%title
          </td>
          <td>
            lang:fr of title
          </td>
        </tr>
        <tr>
          <td>
            Mixed
          </td>
          <td>
            The author of (the book) whose title is the french "Le
            Petit Prince"
          </td>
          <td>
            "Le Petit Prince"^lang:fr^doc:title.author
          </td>
          <td>
            author % doc:title $ lang:fr $"Le Petit Prince"
          </td>
          <td>
            author of thatwhich <em>has</em> title thatwhich
            <em>has</em> lang:fr "Le Petit Prince"
          </td>
        </tr>
        <tr>
          <td>
            Unary function
          </td>
          <td>
            The sine of x. sin(x)
          </td>
          <td>
            x.sin
          </td>
          <td>
            sin%x
          </td>
          <td>
            sin of x
            <p>
              x's sin
            </p>
          </td>
        </tr>
        <tr>
          <td>
            (its inverse)
          </td>
          <td>
            arcsin(x)
          </td>
          <td>
            y^sin
          </td>
          <td>
            sin$y
          </td>
          <td>
            thatwhich sin y
          </td>
        </tr>
        <tr>
          <td>
            N-ary function
          </td>
          <td>
            The maximum of 12, 23 and 20
          </td>
          <td>
            ("12" "23" "20").max
          </td>
          <td>
            max$("12" "23" "20")
          </td>
          <td>
            max of ("12" "23" "20")
          </td>
        </tr>
        <tr>
          <td>
            Labeled traversal
          </td>
          <td>
            A sculpture with a price of x dollars and creator y
            domiciled in italy.
          </td>
          <td>
            [a Sculpture; cost :x^dollars; creator [is y; domicile
            cc:it]]
            <p>
              <em>This use is not real N3 syntax unless we change
              "is"</em>
            </p>
          </td>
          <td>
            domicile$
          </td>
          <td>
            [] a Sculpture; cost [] dollars 100; creator :y which
            :domicile cc:it. <em>Note that a consistent grammar is
            not obvious</em>
          </td>
        </tr>
      </tbody>
    </table>
    <h3>
      <a name="L7286" id="L7286">Multiply, Divide</a>
    </h3>
    <p>
      Metaphor: Units of measure
    </p>
    <p>
      A snappy syntax is useful in the leaves of an expression
      tree. Examples come up frequently when the logical way to
      express data types, units of measure, and so on is with a
      graph traversal. With units of measure, people use
      multiplication and division, and these actually make sense
      mathematically.
    </p>
    <p>
      Cost = 100*dollars or even Cost/dollars = 100 and
      Cost/day=100*dollars.
    </p>
    <p>
      Pro: / and * are indeed inverse, when you have unique and
      unambiguous functions: x/y*y =x.
    </p>
    <p>
      Con: This is not always the case! Also, "*" and "/" in math,
      and in units of measure, have properties like commutativity
      which you expect of "*" and which it doesn't have in this context.
      Also, I had expected that it would be pragmatic to add in
      operators directly to the syntax for convenience, and so was
      reserving <em>+ - * /</em>.
    </p>
    <h3>
      <a name="Keywords" id="Keywords">Keywords</a> - which, of,
      's, the
    </h3>
    <p>
      The english language suggests some keywords.
    </p>
    <p>
      "which" I have considered using in a sentence to turn the
      current object into the new subject. There are two forms I
      had thought of, I'll call them "which" and "thatwhich" for
      now. "Which", as in english, applies to a started object and
      allows labelled traversal. "thatWhich" is used for backward
      traversal, though the grammar is different.
    </p>
    <p>
      <code>:joe :son :johnny which has :girlfriend :jane.</code>
    </p>
    <p>
      <code>:joe :son thatWhich :girlfriend :jane.</code>
    </p>
    <p>
      <code>thatwhich has :home thatwhich has :email thatwhich
      has</code>
    </p>
    <p>
      Pro: <em>which</em> reads very well (unless you insist on
      <em>whose</em>!), especially with N3's optional <em>has</em>
      before the property.
    </p>
    <p>
      Con: <em>thatwhich</em> is unbelievably ugly. Even
      <em>which</em>, while reading well, is not a very concise
      form.
    </p>
    <p>
      A possibility is to just use <em>which</em>, with [] for the
      <em>that</em> or <em>something</em> which precedes it in
      english grammar. In fact, if someone wants <em>something</em>
      as a synonym of [] I wouldn't violently object.
    </p>
    <p>
      <code>:joe :son [] which :girlfriend :jane.</code>
    </p>
    <p>
      A synonym for "which" could be the more mathematical
      "suchthat", which suggests a vertical bar.
    </p>
    <p>
      <code>:joe :son [] | :girlfriend :jane.</code>
    </p>
    <p>
      This makes an effective traversal operator []| which is an
      eyeful, but the pipe is nice as a connector.
    </p>
    <p>
      <code>joe son :johnny | girlfriend jane | mother [] | email
      &lt;audey@example.com&gt;.</code>
    </p>
    <p>
      "Of" is interesting, though it could be confusing that it parses
      right to left.
    </p>
    <p>
      <code>email of home of boss of x</code> means <code>email of
      (home of (boss of x))</code>
    </p>
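    <p>
      The right-to-left grouping can be shown with a tiny Python
      sketch that rewrites a chain of "of" into the dotted,
      left-to-right form. The function name <code>of_to_dot</code> is
      hypothetical, and the sketch assumes a plain forward-traversal
      chain with single spaces around each "of".
    </p>

```python
def of_to_dot(phrase):
    """Rewrite 'p of q of x' as 'x.q.p' (forward traversal only)."""
    # "of" groups to the right, so the innermost node is the last word:
    # email of (home of (boss of x)).
    parts = phrase.split(" of ")
    return ".".join(reversed(parts))
```

    <p>
      For example, <code>of_to_dot("email of home of boss of x")</code>
      gives <code>x.boss.home.email</code>: the same path, written in
      the opposite order.
    </p>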
    <p>
      I just noticed that when I write on the blackboard, % and
      <em>of</em> look pretty similar, so % to be read as
      <em>of</em> would be a possibility for a forward traversal prefix
      operator.
    </p>
    <p>
      The astute will have noticed that "of" is already used as a
      keyword in N3. However, all is not lost, in fact much could
      be gained. Could one not split "of" and "is" into separate
      features of the language, <code>p of y</code> being simply
      short for what is currently <code>[ is p of y]</code>, and
      <code>is</code> being an operator which at the syntactic
      level indicates that two things are the same node?
    </p>
    <p class="detail">
      (This is not the same as N3's =, which is daml:equivalentTo,
      which has axioms about properties of similar things being the
      same, but is not involved at this level. N3 and RDF treat
      different URI-identified nodes separately, whether or not a
      daml:equivalentTo arc joins them.)
    </p>
    <p>
      This allows things like
    </p>
    <p>
      <code>joe brother [ is fred; wife margy; kids jane,
      john]</code>.
    </p>
    <p>
      Contrast "of" with the english 's, German -es
    </p>
    <p>
      <code>x's boss's home's email</code> meaning <code>((x's
      boss)'s home)'s email</code>
    </p>
    <p>
      which reminds one of Ada's
    </p>
    <p>
      <code>x'boss'home'email</code>
    </p>
    <p>
      of whose etymology I am unaware.
    </p>
    <p>
      Con: I was kinda thinking of keeping all the quotes I can in
      hand for use in various forms of quotation! So many languages
      need many forms of quotation and run out of options all too
      fast. (XML and Python both use " and ' to mean the same - a
      waste if you ask me!)
    </p>
    <p>
      One could go the other way and just use a keyword "s"
    </p>
    <p>
      <code>x s boss s home s email.</code>
    </p>
    <p>
      or use a "$" with a closeness to "'s" and expectation of
      being read aloud as such:
    </p>
    <p>
      x$boss$home$email
    </p>
    <p>
      "The" in english signifies the uniqueness of something, and
      so could be used to indicate that something is indeed a
      function.
    </p>
    <p>
      the email of the home of the boss of x
    </p>
    <h3>
      <a name="Arrows" id="Arrows">Arrows</a>
    </h3>
    <p>
      Access limited logic, and the original N3 design, one of the
      conceptual graph serializations, and other languages derived
      from a transcription of whiteboard circles-and-arrows
      diagrams, use "-&gt;" or "&gt;" as a traversal operator.
      Multics used (I understand) "&gt;" for descent of a directory
      tree and "&lt;" for ascent, so ../../foo/test would be
      &lt;&lt;foo&gt;test which is neat even though it frightens
      the xml-minded side of one.
    </p>
    <p>
      N3 uses &lt;&gt; to surround URIs, which I suppose could be
      changed, but it interferes strongly with this.
    </p>
    <h3>
      <a name="Slashes" id="Slashes">Slashes</a>
    </h3>
    <p>
      Same idea as arrows, but using slash.
    </p>
    <p>
      Pro: The metaphor with directory traversal is useful (even
      though the graph being traversed is not always a tree).
    </p>
    <p>
      Pro: A nice simplicity.
    </p>
    <p>
      <code>x.uncle^boss.home.email</code>
    </p>
    <p>
      becomes
    </p>
    <p>
      <code>x/uncle\boss/home/email</code>
    </p>
    <p>
      Con: Unix types could find it strange when finding their
      universal escaping character used as anything else. And it
      rules out using it for that form.
    </p>
    <p>
      Con: The confusion which Microsoft introduced by using
      backslashes for directories has done lasting harm to the
      community, leaving many people still unsure which is which.
      This sort of
    </p>
    <h3>
      <a name="Parens" id="Parens">Parens</a>
    </h3>
    <p>
      The application of a monadic function is a special case of
      the traversal of a graph arc, so syntactic metaphors from
      functions would seem appropriate. The most obvious case is
      when a function takes a list, to just abut the function
      identifier to the list, looking like a regular function call
      in more languages than I could name:
    </p>
    <p>
      <code>x = math:max(y z w)</code> for <code>x = [ is math:max
      of (y z w)]</code>
    </p>
    <p>
      Pro: Looks great.
    </p>
    <p>
      Con: Doesn't work when the function doesn't take a list.
      Also, if you get a space in between, it means something
      completely different. Hopefully it will in some cases at
      least be a syntax error, but not within a list.
    </p>
    <p>
      Maybe a separator of some sort as punctuation would work: a
      left/right reversed form of "."
    </p>
    <p>
      <code>x = math:max$(y z w)</code>
    </p>
    <h3 id="Summarizin">
      Summarizing
    </h3>
    <table border="1">
      <caption>
        Categorizing
      </caption>
      <tbody>
        <tr>
          <td></td>
          <td>
            Forward traversal
          </td>
          <td>
            Backward traversal
          </td>
        </tr>
        <tr>
          <td>
            suffix
          </td>
          <td>
            x.email
            <p>
              x's email
            </p>
          </td>
          <td>
            y^email
          </td>
        </tr>
        <tr>
          <td>
            prefix
          </td>
          <td>
            email of x
            <p>
              email(x)
            </p>
          </td>
          <td>
            [] which email y
            <p>
              [] | email y
            </p>
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      One thing that becomes evident: it can be really difficult to
      read the backward traversal in english. Like many systems
      (including WWW), english is optimized for forward traversal.
    </p>
    <h3>
      Swan
    </h3>
    <p>
      Sandro's swan language used a name immediately followed by
      "(" as a function opener as in sum(2 3).
    </p>
    <p>
      He also used "." for path traversal.
    </p>
    <h2>
      <a name="Infix" id="Infix">Infix operators</a>
    </h2>
    <p>
      I had reserved * / + - for infix operators for arithmetic.
      The | operator for or and &amp; for and (or union and
      intersection of sets) are also reasonable to use in this way.
    </p>
    <p>
      If N3 is to have a path toward becoming a language
      in which arithmetic and set operations are easy to write, it
      is hard to improve on infix notation. This would, however,
      change the form of the language significantly. It isn't clear
      that it would still be predictively parsable.
    </p>
    <h2 id="Sets">
      Sets
    </h2>
    <p>
      <a name="following" id="following">The following</a>
      considers design alternatives in extending N3 to include a
      notation for set literals. 2005/1/1
    </p>
    <h3>
      Background on containers
    </h3>
    <p>
      In the area of containers, RDF started with some "Sequences"
      and "Bags" which were in my opinion and with the benefit of
      hindsight, sub-optimal. (The infinite rdf:_1 series of
      predicates was downright weird, and taking it into
      consideration made code much more complicated. Further, for
      all the arbitrary complexity of the rdf:_nnn predicates, they
      didn't tell you that essential bit of information as to when
      the container was finished: what <strong>wasn't</strong> in
      the container.)
    </p>
    <p>
      RDF does however have a <strong>collection</strong> which is
      an ordered list, and is very useful. N3 has a shorthand
      syntax ( 1 2 3 ) for the list of the numbers 1, 2 and 3, and
      the RDF/XML syntax has parseType="Collection" shorthand.
      There is also defined a way of expressing lists in triples
      using blank nodes, using <code>rdf:first</code> and
      <code>rdf:rest</code>, and <code>rdf:nil</code>. The list 1 2
      3 would be expressed as
    </p>
    <pre>[ rdf:first 1; rdf:rest [<br>     rdf:first 2; rdf:rest [<br>        rdf:first 3; rdf:rest rdf:nil]]]
</pre>
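    <p>
      The expansion above is mechanical, and can be sketched in Python
      as follows. The blank node labels (<code>_:b1</code> and so on),
      the use of prefixed names as plain strings, and the function
      names are assumptions for illustration only.
    </p>

```python
import itertools

RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

def list_to_triples(items):
    """Return (head, triples) for the rdf:first/rdf:rest form of items."""
    counter = itertools.count(1)
    triples = []
    head = RDF_NIL
    # Build from the tail so each cell can point at the rest of the list.
    for item in reversed(items):
        cell = "_:b%d" % next(counter)
        triples.append((cell, RDF_FIRST, item))
        triples.append((cell, RDF_REST, head))
        head = cell
    return head, triples

def triples_to_list(head, triples):
    """Walk rdf:rest from head until rdf:nil, collecting rdf:first values."""
    first = {s: o for (s, p, o) in triples if p == RDF_FIRST}
    rest = {s: o for (s, p, o) in triples if p == RDF_REST}
    items = []
    while head != RDF_NIL:
        items.append(first[head])
        head = rest[head]
    return items
```

    <p>
      A list of three items becomes six triples, and the terminating
      <code>rdf:nil</code> is what says the list is finished, which is
      exactly the information the rdf:_nnn containers lacked.
    </p>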
    <p>
      This is, if you like, a reification of a list. It describes
      it totally. Some RDF systems actually store lists in this
      way. The RDF and OWL specs together are not (as far as I was
      aware) very clear about the axioms of lists. One would expect
      clear axioms that all lists exist, that any two lists with
      the same first and rest are owl:equivalent, and so on.
    </p>
    <h3>
      Introducing sets
    </h3>
    <p>
      It turns out that in many cases in applications we have seen,
      containers are in fact logically unordered sets, not ordered
      lists. Whether it is mail addresses on a mailing list, or
      rows in a database, or statements in an N3 formula, the order
      is immaterial, and something can occur in the set once or not
      at all.
    </p>
    <p>
      In these circumstances to use a list to represent the data is
      suboptimal in many ways. For example,
    </p>
    <ul>
      <li>It is not clear when two different lists actually have
      the same members in a different order that they represent the
      same set;
      </li>
      <li>The information about what is in fact a set ends up being
      communicated out of band, or just assumed by those who know
      the application;
      </li>
      <li>Underlying implementations cannot use code library
      support which is optimized for sets.
      </li>
    </ul>
    <p>
      For these reasons it is useful to have sets in the language
      in the same way as lists: to have a reification - a way of
      expressing them in triples so as to be able to pass them
      through general RDF applications which may be unaware of them,
      and a shorthand syntax to allow them to be written
      efficiently.
    </p>
    <h3>
      Reification
    </h3>
    <p>
      It turns out that OWL provides us with owl:oneOf, a
      relationship between a class and a list, such that the class
      is the class of things which are members of the list. Unless
      for some reason one wants to make sets different from
      classes, it seems appropriate to use classes for sets, and
      furthermore to use owl:oneOf as the constructor which allows
      us to specify a specific set in terms of an arbitrary
      ordering of its contents. The set of numbers 1, 2 and 3 would
      then be written as
    </p>
    <pre>[ owl:oneOf (1 2 3)] 
</pre>
    <p>
      or, to elaborate it down to triples:
    </p>
    <pre>[ owl:oneOf <br>  [ rdf:first 1; rdf:rest [<br>     rdf:first 2; rdf:rest [<br>        rdf:first 3; rdf:rest rdf:nil]]]]
</pre>
    <p>
      Of course, any reification of a set which lists the same
      members in a different order describes the same set.
    </p>
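    <p>
      In programming terms the point is simply that the list under
      owl:oneOf is a carrier for the members, not part of the set's
      identity. A minimal Python sketch, using <code>frozenset</code>
      as a stand-in for the class (an assumption for illustration; the
      function name is hypothetical):
    </p>

```python
def set_from_one_of(listed_members):
    """Model [ owl:oneOf ( ... ) ] as an unordered set of its members."""
    return frozenset(listed_members)

# Two reifications listing the same members in different orders
# (the second even repeats one) describe the same set.
a = set_from_one_of([1, 2, 3])
b = set_from_one_of([3, 1, 2, 2])
```

    <p>
      Here <code>a == b</code> holds even though the two underlying
      lists differ as lists.
    </p>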
    <h3 id="Syntax1">
      Syntax
    </h3>
    <p>
      This is the more difficult choice! Here is a table of
      suggested syntax extensions to N3 for sets.
    </p>
    <table border="1">
      <caption>
        Syntax extensions suggested for sets in N3
      </caption>
      <tbody>
        <tr>
          <td>
            Syntax
          </td>
          <td>
            Advantages
          </td>
          <td>
            Disadvantages
          </td>
        </tr>
        <tr>
          <td>
            (1, 2, 3)
          </td>
          <td>
            Minimal encroachment on to new punctuation.<br>
            Comma becomes a marker for lack of ordering. This is
            consistent with an object list.
          </td>
          <td>
            Parser has to look ahead a whole expression to know
            which it is dealing with: major change.
          </td>
        </tr>
        <tr>
          <td>
            ($ 1 2 3 $)
          </td>
          <td>
            "S" stands for "set". Otherwise just like lists.
          </td>
          <td></td>
        </tr>
        <tr>
          <td>
            {$ 1 2 3 $}
          </td>
          <td>
            "S" stands for "set". Curly braces are conventional for
            sets. Curly braces are used for formulae, which are
            also unordered.
          </td>
          <td>
            Curly is used for formulae, which are not normal
            collections
          </td>
        </tr>
        <tr style="color: rgb(0, 0, 0); background-color: rgb(250, 250, 250);">
        <td>
            {$ 1, 2, 3 $}
          </td>
          <td>
            "S" stands for "set". Curly braces are conventional for
            sets. Curly braces are used for formulae, which are
            also unordered.<br>
            Comma becomes a marker for lack of ordering. This is
            consistent with an object list.
          </td>
          <td>
            Curly is used for formulae, which are not normal
            collections.
          </td>
        </tr>
        <tr>
          <td>
            {* 1, 2, 3 *}
          </td>
          <td>
            "S" stands for "set". Curly braces are conventional for
            sets. Curly braces are used for formulae, which are
            also unordered.<br>
            Comma becomes a marker for lack of ordering. This is
            consistent with an object list.
          </td>
          <td>
            Curly is used for formulae, which are not normal
            collections. Asterisk could be used as infix operator,
            though not with .
          </td>
        </tr>
        <tr>
          <td>
            {, 1, 2, 3 }
          </td>
          <td>
            Curly braces are conventional for sets. Curly braces
            are used for formulae, which are also unordered.
          </td>
          <td>
            Curly is used for formulae, which are not normal
            collections. Weird and unconventional to start with a
            comma
          </td>
        </tr>
        <tr>
          <td>
            @Set{1, 2, 3}
          </td>
          <td>
            Just a new keyword, no extra syntax.
          </td>
          <td>
            d.
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      The current choice is {$ 1, 2, 3 $} which is conventional
      mathematical set notation, plus dollar signs to distinguish
      a set from a formula.
    </p>
    <p>
      An interesting possibility pointed out by Sandro Hawke is to
      actually make sets and formulas examples of the same thing. A
      formula is just a set: a set of statements. This makes
      statements first class objects. This is inherently appealing
      in its symmetry. However, as there is no statement opener
      syntax, only the closer (".", and effectively ";" and ","),
      there is no way for the parser to know in advance whether a
      statement or set is being parsed. This would not be the end of
      the world, but makes life more difficult. Further, the current
      syntax allows an empty property list, so [ a :Deciduous, :Pine
      ]. is valid N3. This means that { :x } is a valid statement
      (with no triples), which would overlap with set syntax.
    </p>
    <h3>
      <a name="Disjoint" id="Disjoint">Disjoint</a> sets?
    </h3>
    <p>
      There is an issue as to whether {$ :a , :b, :c $} implies that
      a, b and c are distinct. There was a <a href="http://rdfig.xmlhack.com/2005/01/26/2005-01-26.html#1106763446.326655">
      discussion</a> of this in the SWIG.
    </p>
    <p>
      If sets are disjoint:
    </p>
    <ul>
      <li>You can say how many members are in a set.
      </li>
      <li>You cannot form the union or intersection of two sets
      unless all the members involved are known to be disjoint, (or
      one knows which ones are equivalent), for example if one
      knows that they each are members of a larger set.
      </li>
      <li>In applications where the assumption is that a set is
      disjoint, the system can check and trap an error if two
      members turn out to be the same.
      </li>
      <li>Cwm in smush mode, --mode=e, when it takes into account
      equality, would probably remove duplicates from sets but
      there would be no significance.
      </li>
    </ul>
    <p>
      If sets are not disjoint:
    </p>
    <ul>
      <li>You don't know how many members they have, in general.
      </li>
      <li>You can do set union, but not intersection.
      </li>
      <li>You can validly handle sets where you don't actually know
      how many distinct (people, say) there are.
      </li>
      <li>Cwm in smush mode, --mode=e, when it takes into account
      equality, would on loading a new equality, in some cases
      reduce the number of members of a set mentioned in the
      knowledge base.
      </li>
    </ul>
    <p>
      One possibility is to build into the processor that in a mode
      in which it is aware of equality it also tracks disjointness,
      for example using inverse functional properties and
      functional properties with numeric ranges.
    </p>
    <p>
      Of course, where all the members of a set are values of a
      datatype which provides a binary equality operator, such as
      integers, this is not a problem.
    </p>
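    <p>
      The effect of learning an equality on a non-disjoint set can be
      sketched in a few lines of Python. This is an illustration
      only, not a description of cwm's implementation: the
      <code>same_as</code> mapping, the function names, and the
      assumption that the mapping contains no cycles are all
      hypothetical.
    </p>

```python
def canonical(node, same_as):
    """Follow same_as links (assumed acyclic) to a representative node."""
    while node in same_as:
        node = same_as[node]
    return node

def smush(members, same_as):
    """Replace each member by its representative, merging known-equal nodes."""
    return {canonical(m, same_as) for m in members}

members = {":a", ":b", ":c"}
before = len(smush(members, {}))            # no equalities known yet
after = len(smush(members, {":b": ":a"}))   # after learning :b = :a
```

    <p>
      With no equalities known the set appears to have three members;
      once <code>:b</code> is known to equal <code>:a</code> it has
      two. If sets were defined to be disjoint, the same discovery
      would instead be an error to trap.
    </p>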
    <h2>
      <a name="Considered" id="Considered">Considered design
      alternatives in other areas</a>
    </h2>
    <p>
      (older)
    </p>
    <ol>
      <li>Using : for &gt;- and -&gt; so that the propertylist
      looks like a list of attributes. Advantages: really human
      readable. Disadvantage: keep "=" as an operator. Also, I
      don't like "=" being used for something which is not
      equality. It is ingrained as a binary reflexive operator and
      it would be confusing to use it in attribute attribution.
      Alternative alternative: use ":" for both "-&gt;" and
      "-&lt;".
      </li>
      <li>Use/allow keyword "has" for &gt;- and "is" for &lt;-.
      Maybe, if still unambiguous, allow "of" for both "-&gt;" and
      "-&lt;". And/or use colon instead of "of". These assume that
      the english words people pick as properties are noun clauses.
      I actually preferred the use of verb clauses for what is in
      fact a verb. I used to prefer "writtenBy" to "author". Now I
      have found the role-noun form much better.
      </li>
      <li>Making the subject of the propertylist, be another
      property. (Say, "ref"). This is like Henrik's SOAP-RDF
      mapping. Every statement has to become an anonymous node
      syntax example: [ &gt;- core:ref -&gt; [ &gt;- x:firstname
      -&gt; "Ora" ] ; &gt;- dc:wrote -&gt; [ &gt;- dc:title -&gt;
      "Moby Dick" ]]. The thing becomes a binary rather than
      ternary syntax so we should use binary syntax. Using -&lt;
      and -&gt; only (omitting the &gt;- and &lt;- ) example would
      be
        <p>
          [ core:ref : [ x:firstname : "Ora" ] ;
        </p>
        <p>
          dc:wrote : [ dc:title : "Moby Dick" ]
        </p>
        <p>
          ]
        </p>
        <p>
          or equally well
        </p>
        <p>
          [ x:firstname : "Ora" ;
        </p>
        <p>
          dc:wrote : [ dc:title : "Moby Dick" ]
        </p>
        <p>
          ]
        </p>
        <p>
          We need better examples, requiring explicit reference to
          the subject by URI.
        </p>
      </li>
      <li>Allowing well-formed XML element as object. reserve
      &lt;alpha for this? What does XML infoset look like expressed
      in RDF in notation3? decide: don't do it. Burdens notation3
      compiler with XML parser weight.
      </li>
      <li>Use &lt;&gt; for URIs instead of ' - DanC. Hmmmm I wanted
      to keep &lt;&gt; for other things maybe like string
      delimiters. Actually it is cool to use inverse &lt;. for
      strings &gt;this is a string&lt; because then you end up being
      able to make pages which look like markup and which are
      functions in notation3.
      </li>
      <li>Bind vs @prefix. Bind was a directive which declared a
      namespace with an implicit "#" between the namespace and the
      local name. This has many advantages: it meant that by
      looking at a URIref one could separate it unambiguously into
      namespace URI and fragment ID. This in turn meant one could
      dereference the namespace URI to get a schema or other
      information describing the namespace. However, this is not
      standard RDF. Nevertheless, the use of namespaces ending in
      "#" is recommended, as then the items in the name space can
      be easily described by a single document associated with the
      namespace identifier.
      </li>
      <li>Whitespace: what about Unicode NL? This was
      included as one of the few changes which happened in
      XML as it changed from 1.0 to 1.1. NL is a C1 control
      character which was introduced to allow the EBCDIC newline
      character to be encoded. Why should one have a separate
      NL from the LF which CCITT defined all those years ago as the
      code to be used when newline (CR LF together) was required?
      </li>
    </ol>
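    <p>
      The point made under "Bind vs @prefix" above, that a URIref
      whose namespace ends in "#" can be separated unambiguously
      into namespace URI and fragment ID, can be illustrated with a
      minimal sketch. It assumes Python and its standard
      <code>urllib.parse</code> module; the helper name
      <code>split_uriref</code> and the example.org namespace are
      hypothetical, chosen only for illustration.
    </p>
<pre>
```python
from urllib.parse import urldefrag

def split_uriref(uriref):
    """Split a URI reference at '#' into (namespace URI, local name)."""
    namespace, fragment = urldefrag(uriref)
    return namespace, fragment

# Hypothetical namespace, used only for illustration.
ns, local = split_uriref("http://example.org/schema#title")
print(ns)     # http://example.org/schema
print(local)  # title
```
</pre>
    <p>
      The namespace part is then a plain URI which could be
      dereferenced to fetch a document describing the namespace.
    </p>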
    <h2>
      Fodder
    </h2>
    <p>
      Connolly points out: "This grammar starts to look a lot like
      the formalized english/conceptual grammar stuff. &gt;
      http://meganesia.int.gu.edu.au/~phmartin/WebKB/doc/grammars/
      &gt;
      http://www8.org/w8-papers/3b-web-doc/embedding/embedding.html
    </p>
    <p>
      Philippe Martin says, "Given the similarities of your
      Notation 3 with the (currently) more readable and expressive
      Frame-CG notation (FCG) that I designed 2 years ago and that
      is one of the notations used in my large-scale knowledge
      server <a href="http://www.webkb.org/">WebKB-2</a> , you
      might want to have a look at some executable <a href="http://www.webkb.org/doc/webkb2OntologicalExamples.html">example
      files</a> (e.g. ) and at the <a href="http://www.webkb.org/doc/F_languages.html#FCG">grammar</a>.
      The wide range of "quantifiers" is especially useful. You are
      welcome to copy any part of the FCG grammar into your
      Notation 3." (email 2001/09/17)
    </p>
    <h2 id="Footnote">
      Footnote
    </h2>
    <h3 id="Thought">
      Thought process behind implicit definition
    </h3>
    <p>
      How does one label a node in notation 3 for incoming
      reference? (The equivalent of "rdf:id=")? How about a property
      "is hereby defined to be" with a suitable shorthand?
    </p>
    <p>
      One can then refer to such a thing internally as '#foo'
      which is a bit messy but not bad. You can't have keywords and
      identifiers both using that precious status of pure
      alphanumerics unless you reserve keywords.
    </p>
    <p>
      [ &gt;- n:def -&gt; '#ora' ; &gt;- x:firstname -&gt; "Ora" ]
      .
    </p>
    <p>
      [ '#ora' &gt;- dc:wrote-&gt; [ &gt;- dc:title -&gt; "Moby
      Dick" ] ] .
    </p>
    <p>
      [ &gt;- x:firstname -&gt; "Laura" ] &lt;- x:hasChild-&lt;
      '#ora' .
    </p>
    <p>
      or equally well
    </p>
    <p>
      [ &gt;- n:def -&gt; '#ora' ; &gt;- x:firstname -&gt; "Ora" ]
      .
    </p>
    <p>
      [ '#ora' &gt;- dc:wrote-&gt; [ &gt;- dc:title -&gt; "Moby
      Dick" ] ] .
    </p>
    <p>
      [ &gt;- x:firstname -&gt; "Laura" ] &lt;- x:hasChild-&lt;
      '#ora' .
    </p>
    <p>
      Ah. Now consider what is the difference between reference and
      definition? I conclude there is none, as both are the
      assertion that the resource in question is identified by a
      URI. In the statements:
    </p>
    <p>
      [ &gt;- n:def -&gt; '#ora' ; &gt;- x:firstname -&gt; "Ora" ]
      .
    </p>
    <p>
      [ '#ora' &gt;- x:lastname -&gt; "Lassila" ] .
    </p>
    <p>
      is there any significance in which statement comes first? The
      node '#ora' is defined to be one which has firstname "Ora"
      and lastname "Lassila" whichever way one looks at it. I would
      therefore propose that
      the use of a new local symbol :foo or '#foo' is taken as
      introducing it, but the definition of it by the document is
      really the whole web of statements which involve it. In fact,
      it may be rather difficult to talk about the definition of it
      as distinct from the document, and as it is always best to
      avoid extra concepts, I won't.
    </p>
    <p>
      The above examples should just be, therefore,
    </p>
    <p>
      [ '#ora' &gt;- x:firstname -&gt; "Ora" ] .
    </p>
    <p>
      [ '#ora' &gt;- x:lastname -&gt; "Lassila" ]
    </p>
    <p>
      Isn't that simpler?
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 19 Dec 1996 00:00:00 GMT</pubDate>
  <title>The Myth of Names and Addresses</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/NameMyth.html</link>
    <guid>https://www.w3.org/DesignIssues/NameMyth.html</guid>
      <description><![CDATA[
    <h1>
      The Myth of Names and Addresses<br>
    </h1>
    <p>
      The discussion above about the universality of URIs
      (Universal Resource Identifiers) mentions briefly how URIs
      are designed to encompass both things we think of as
      addresses and those we think of as names. Much of the
      discussion of this issue has been clouded by attempts to
      distinguish names from addresses. The term "identifier" was
      picked in an attempt to side-step this issue but
      historically, that did not prevent a quagmire of circular
      discussion which in some circles paralyzed any forward
      progress. Therefore, in this section let me state the
      philosophy which to my mind sets this problem in the right
      light and should prevent further fruitless discussion.
    </p>
    <p>
      There is the commonly held belief that names and addresses
      are different and distinct. We learn the importance of the
      difference between identifiers in a programming language and
      addresses within a computer memory. We learn the difference
      in properties between fully qualified domain names on the
      internet and internet protocol addresses. This can lead us
      easily into imagining that there are two types of objects:
      Names, which once attached to an object follow it for its
      life wherever it should reside, and "addresses" which change
      frequently whenever an object moves or is copied or
      replicated from one "location" to another.
    </p>
    <p>
      However, the only true location is a point in three
      dimensional space, and within computer systems and especially
      networked computer systems there are a very large number of
      complex indirections between almost anything we would call a
      name <i>or</i> an address and the actual physical location of
      the memory cell which stores it. At one end of the spectrum a
      computer memory address often is really an address within a
      virtual memory space allocated to a particular process, and
      when used is translated by the hardware into a physical
      memory address, or for that matter into an address in a
      piece of memory which has been moved out to a
      swap file on disk storage. Filenames are mapped
      through mount tables and directory files into "inodes" which
      are mapped onto track and sector locations. Internet protocol
      addresses [IP Addresses] similarly are not bound absolutely
      to a given computer: they can be re-allocated within the
      constraints that because they are used for routing, there is
      information connecting parts of the IP address with routing
      information and so the computer corresponding to a given IP
      address cannot be moved far in the routing structure. So, we
      see that the constraint on how you can re-use an address is a
      function of what information is in the address. When most
      programs or people mention IP addresses, they simply quote
      four decimal numbers, each between naught and 255 without
      worrying about the internal structure. So, the information
      within the IP address which prevents it being re-used in a
      different area is to most people not explicit: it is, if you
      like, hidden within there as the reason why IP addresses
      can't be re-used elsewhere. When we want to use something to refer to a
      computer but still be able to move the computer or at least
      the thing corresponding to that identification across from
      one part of the internet to another, we use a domain name.
      The domain name system, being completely independent of the
      routing system, allows us to allocate any IP address at all
      to a computer of a given domain name. Therefore, if we
      believe the naming myth, the domain name is a name and the IP
      address is truly an address.
    </p>
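    <p>
      The indirection described above can be illustrated with a
      minimal sketch, assuming Python. The dictionary
      <code>dns_table</code> is only a toy stand-in for the domain
      name system, and the host name and addresses are made up for
      the example (they are taken from ranges reserved for
      documentation).
    </p>
<pre>
```python
# Toy name table standing in for the DNS; all names and addresses are made up.
dns_table = {"server.example.org": "192.0.2.10"}

def resolve(name):
    """Dereference a domain name to the IP address it is currently bound to."""
    return dns_table[name]

before = resolve("server.example.org")

# The machine moves to another part of the network. Its address has to change,
# because the address carries routing information, but the name is simply
# re-bound to the new address.
dns_table["server.example.org"] = "198.51.100.7"
after = resolve("server.example.org")

print(before)  # 192.0.2.10
print(after)   # 198.51.100.7
```
</pre>
    <p>
      Anything which quoted the name keeps working after the move;
      anything which quoted the old address does not.
    </p>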
    <h2>
      <a name="Anecdotes" id="Anecdotes">Two anecdotes about names
      and addresses</a>
    </h2>
    <p>
      Two real-life anecdotes illustrate the dangers of making this
      assumption. When there were only a few web servers and I kept
      a registry of all those which I knew, I was contacted by a
      group in Australia who were putting up a server with some
      interesting botanical information. They sent me some details
      of the server to be put into the list and they gave me the IP
      address of the machine. My email reply explained that I
      always prefer to refer to servers by their domain name rather
      than their IP address and asked them for the domain name of
      the server. They replied that the domain name they would use
      would depend on the department within the university which
      was responsible for maintaining the server but due to a
      university re-organization, it was not at this point clear
      which department that would be. However, they explained that
      they could guarantee that the IP address of the server would
      remain unchanged for a long time.
    </p>
    <p>
      Several years later, the list of servers now abandoned as a
      single list of all World Wide Web servers was among the
      now-extensive web of information maintained on the server
      known as info.cern.ch, the first World Wide Web server set up
      at the start of the World Wide Web project. At this time the
      responsibility for the coordination of World Wide Web
      protocols was shifting from CERN to MIT/LCS and the embryonic
      World Wide Web Consortium. For a while, CERN continued to
      maintain the server, but later the master sources for that
      information were maintained in America. Soon after this the
      authorities at CERN requested that the name info.cern.ch
      should no longer be used to refer to this information, as it
      was no longer under control of CERN and they could no longer
      assume responsibility for it. In fact, there was a policy
      that names in the cern.ch domain should never be allowed to
      refer to Internet addresses which were not physically on the
      CERN site. Therefore all hypertext pointers into the
      info.cern.ch space have had to be changed over the course of
      time to point to the <code>w3.org</code> space.
    </p>
    <p>
      These two examples show the "name" of objects having to be
      changed even though the objects retained their essential
      identity. The reason was in each case embedded information in
      the name: the domain name on the server contains authority
      information about the maintainer of the computer whose
      address corresponds to the domain name. If the authority for
      an object changes, whether it "moves" or not, then there may
      be a need to change its name under these circumstances. It
      turns out that for almost any naming or addressing system in
      which there is some information (other than random numbers or
      dates of creation of the objects) built into the name,
      the name might have to be changed when the facts
      corresponding to that information change. Therefore it
      becomes simply a matter of choice between naming or
      addressing systems as to what sort of information you wish to
      include implicitly or explicitly within your "name" or
      "address".
    </p>
    <h2>
      <a name="Why" id="Why">Why Names Change</a><br>
    </h2>
    <p>
      <small>See also:</small>
    </p>
    <ul>
      <li>
        <small>In the Style Guide for Online Hypertext, <a href="../Provider/Style/Overview.html"><i>Cool URLs don't
        change</i></a></small>
      </li>
    </ul>
    <p>
      It is worth looking at some of the reasons for names in
      practical use to change or need to be changed. Some World
      Wide Web servers have unwisely simply mapped the URL space
      onto a Unix filename space, and the results of this,
      especially in the early days, were URLs which might look like
      this:
    </p>
    <p>
      http://pegasus.cs.foo.edu/disk1/students/romeo/cool/latest/readthis.html
    </p>
    <p>
      Looking at the segments of this name, we can see almost as
      many reasons for the name to need to be changed as there are
      segments.
    </p>
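    <p>
      The segments in question can be pulled apart mechanically. The
      following is a minimal sketch, assuming Python and its standard
      <code>urllib.parse</code> module; the variable names are
      illustrative only.
    </p>
<pre>
```python
from urllib.parse import urlsplit

url = "http://pegasus.cs.foo.edu/disk1/students/romeo/cool/latest/readthis.html"
parts = urlsplit(url)

scheme = parts.scheme                            # the protocol
host_labels = parts.hostname.split(".")          # machine, department, university
path_segments = parts.path.strip("/").split("/")
name, _, extension = path_segments[-1].rpartition(".")

print(scheme)               # http
print(host_labels)          # ['pegasus', 'cs', 'foo', 'edu']
print(path_segments[:-1])   # ['disk1', 'students', 'romeo', 'cool', 'latest']
print(name, extension)      # readthis html
```
</pre>
    <p>
      Each printed piece is a separate fact built into the name, and
      so a separate way for the name to go out of date.
    </p>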
    <p>
      The "http:" will only be changed if the document is later
      served up using a different protocol and, in fact, that is
      probably one of the least likely pieces to change.
    </p>
    <p>
      "Pegasus", the name of the computer, probably has a
      significance within the university as a computer dedicated to
      some particular tasks such as supporting personal student
      activities, and may be maintained by a particular department
      or may even be a name from a project for which the computer
      was originally put into use before it became shared with
      general user space. So, "pegasus" will be changed whenever
      the function of supporting this particular student's web
      pages has to be shared with other functions.
    </p>
    <p>
      "Cs" indicates the computer science department, so the
      document is bound to the computer science department. It may
      not be something which the computer science department has a
      lot of interest in, and the student may well transfer his or
      her interests to other departments in the future.
    </p>
    <p>
      The name of the university "foo.edu" will probably last for a
      good while, though whether the university wants to continue
      to be associated with the document for more than two or three
      years is questionable.
    </p>
    <p>
      The next section of the path, "disk1", is clearly a mistake.
      In fact, of course, disk1 is just a name which can be
      attached to any physical disk, but by grouping together all
      the students on a certain disk in this arbitrary way, one
      makes a binding between all the documents which they create
      which will have to be broken whenever the computer is
      reorganized. In fact, the relocation tables which most
      servers support allow much translation of names to take place
      and make this sort of path quite unnecessary.
    </p>
    <p>
      The next element, "students", identifies Romeo as a student, which may
      change even though he continues to study for the rest of his
      life, and then the next path element "romeo" identifies the
      author of the document. As in the case with CERN above, the
      original author of a document may later not wish to keep
      maintenance or responsibility for ongoing versions. For
      example, the document may be submitted to an organization
      which publishes it and formally takes over responsibility for
      its upkeep; it may achieve a status of some kind as a
      standard or an accepted thesis which causes its maintainers
      to change. The original author may in fact deliberately
      simply pass on authorship of the document to someone else. In
      any of these cases the name would have to change, and all
      references to that name would break.
    </p>
    <p>
      The student himself has not been very wise with his choice of
      path name. For many people, what is "cool" changes with time
      and for most people what is "latest" changes with time.
    </p>
    <p>
      Perhaps the piece of information in the URL least likely to
      change is "readthis", as it contains no information at all, just
      like the proverbial "click here". Effectively, it is a random
      name assigned to the document and as such, is perhaps the
      safest part of the path.
    </p>
    <p>
      The last element of the path, "html", is not strictly
      necessary, as at least some servers will,
      given a URL of "readthis", serve up the data
      from a file which is called "readthis.html". Here the student
      is making it difficult for himself later to change the format
      or formats in which the file is available, without at least
      some confusion. Suppose, for example, that he later decides
      that the information is worth providing in audio format for
      blind readers. The CERN server can easily be configured so
      that clients specifically requesting audio formats in
      preference to HTML can be served audio preferentially, whereas
      more normal clients will get the HTML. So, here again is a
      part of the path which may be later regretted.
    </p>
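    <p>
      The idea of serving one extension-free name in several formats
      can be illustrated with a minimal sketch, assuming Python. It
      is not how the CERN server is configured, and it ignores the
      quality values of real HTTP content negotiation; the
      <code>variants</code> table, the file names and the function
      <code>choose_variant</code> are hypothetical.
    </p>
<pre>
```python
# Hypothetical variants a server might hold for the extension-free name "readthis".
variants = {
    "text/html": "readthis.html",
    "audio/basic": "readthis.au",
}

def choose_variant(preferred_types, available=variants):
    """Return the file for the first media type the client prefers that exists."""
    for media_type in preferred_types:
        if media_type in available:
            return available[media_type]
    return None  # nothing acceptable; a real server would answer with an error

print(choose_variant(["audio/basic", "text/html"]))  # readthis.au
print(choose_variant(["text/html"]))                 # readthis.html
```
</pre>
    <p>
      Because the public name is just "readthis", a format can be
      added or retired without any link to the document changing.
    </p>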
    <p>
      You can play this game with almost any name and address in
      any system, and it is interesting to ask yourself in each
      case: to what extent do I call this a "name" and to what
      extent do I call it an "address"? So, in conclusion we see
      that any information explicitly or implicitly included
      in a name is a threat to its longevity. We see
      that the difference between a "name" and an "address"
      is not so fundamental. That is why:
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            When a new URI scheme is defined, the specification
            defining it should describe the name-like and
            address-like properties of URIs in the new scheme, so
            that those using them can know what to
            expect.
          </td>
        </tr>
      </tbody>
    </table>
    <h2>
      <a name="What" id="What">What's in a name?</a><br>
    </h2>
    <p>
      Why is information included then? Generally, the information
      is included because in order to discover anything about the
      name, one has to "dereference" the name. Typically this uses
      some official or unofficial set of indexes distributed or
      otherwise to look up the name. Many names are hierarchical in
      the authority which allocates them. DNS names are a good
      example. Road names within towns are another good example.
      Therefore to find out where the new "North Street" is located
      in a small town one goes to the town for the definitive answer.
      For information as to where the server "pegasus.cs.foo.edu"
      is, one must send a message directly or indirectly to a
      server controlled by the Foo University.
    </p>
    <p>
      Is it possible to omit all such information from a name?
      Certainly. Message identifiers in mail have only the need to
      be unique. So, whereas hierarchical names and time stamps may
      be used to help make such identifiers unique, you cannot
      dereference the names at all. Perhaps we should call these
      "identifiers" rather than "names". Within a certain context,
      it is extremely useful to be able to refer to a mail message
      by its mail identifier. We say that these identifiers support
      the notion of equality: even though they cannot be
      dereferenced, you can test two mail messages to find out
      whether they are in fact the same simply by testing their
      identifiers. You can also within a finite set of mail
      messages look up a message of a given identifier. You just
      can't do this on a global scale. So this then is the essence
      of the naming problem:<br>
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            The naming problem: if you put information in a name,
            it decreases its longevity; if you don't you can't
            dereference it to a resource.
          </td>
        </tr>
      </tbody>
    </table>
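    <p>
      The behaviour of information-free identifiers can be
      illustrated with a minimal sketch, assuming Python and its
      standard <code>uuid</code> module. The <code>mailbox</code>
      dictionary and the <code>store</code> function are
      hypothetical; real mail systems build message identifiers
      differently.
    </p>
<pre>
```python
import uuid

# A finite set of messages, keyed by opaque identifiers which are unique
# but carry no information that could be used to dereference them.
mailbox = {}

def store(body):
    """File a message under a freshly generated random identifier."""
    message_id = uuid.uuid4().hex
    mailbox[message_id] = body
    return message_id

first = store("Meeting at noon")
second = store("Meeting at noon")

# Equality: identifiers tell two messages apart even when the text is the same.
print(first == second)                # False

# Lookup works inside the finite set we happen to hold...
print(mailbox.get(first))             # Meeting at noon

# ...but an identifier from elsewhere gives us nowhere to go and ask.
print(mailbox.get(uuid.uuid4().hex))  # None
```
</pre>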
    <h3>
      <a name="social" id="social">Naming: A social and contractual
      Issue</a>
    </h3>
    <p>
      Many, many solutions to the naming problem have been
      attempted and successfully deployed in different
      circumstances. At one end of the scale, it would be in fact
      possible using a huge network of hash tables around the
      world, to keep a hash index of all randomly generated unique
      names. The problem with this idea is that there would have to
      be one single funding model and one homogeneous quality of
      service for all names. There would be no way to pay more for
      a more persistent name.
    </p>
    <p>
      At the other end of the scale, hierarchical systems such as
      the domain name system, and the X.500 name system, have been
      implemented. Suppose one wants to use a name which can be
      dereferenced and therefore must put some information in it.
      That information will lead us to some authority or some root
      to dereferencing the name. How can we maintain the lifetime
      of that name as something which can be dereferenced? The only
      way is that we have a contract with all the agencies which
      are involved in supporting the systems which dereference that
      name that they should continue their operation giving a
      certain quality of service for a certain period of time.
    </p>
    <p>
      Suppose the Foo Alumni Association ran a URL service in which
      a special name such as
      "http://alumni.foo.edu/1998/romeo/202-aab" would be available
      to any graduate paying their dues, and maintained
      indefinitely (perpetual care) on receipt of a suitable
      endowment.
    </p>
    <p>
      Of course, as organizations dissolve and mutate, there is
      nothing to stop one organization from taking over the support
      of the archives of another. For this purpose, it
      would be very useful to have a syntax for putting a date into
      a domain name. This would allow a system to find an
      archive server. Imagine that, failing to find
      "info.cern.ch", one could search back and find an entry
      "info.cern.ch.1994" which pointed to www.w3.org as a current
      server holding archive information for info.cern.ch as it was
      in 1994, with, of course, &nbsp;pointers to newer versions of
      the documents.
    </p>
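    <p>
      The search back through dated names could work along the lines
      of the following minimal sketch, assuming Python. The dated
      syntax is only the idea floated above, not an existing DNS
      feature, and the <code>registry</code> table and
      <code>find_server</code> function are hypothetical.
    </p>
<pre>
```python
# Toy registry. The dated-name syntax is only the idea floated in the text,
# not an existing DNS feature.
registry = {
    "info.cern.ch.1994": "www.w3.org",
}

def find_server(name, years):
    """Return the host for name, else the archive host of the latest dated entry."""
    if name in registry:
        return registry[name]
    # The plain name is gone: search back through dated entries, newest first.
    for year in sorted(years, reverse=True):
        dated = f"{name}.{year}"
        if dated in registry:
            return registry[dated]
    return None

print(find_server("info.cern.ch", range(1990, 1997)))  # www.w3.org
```
</pre>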
    <h3>
      <a name="QoS" id="QoS">Quality of Service</a>
    </h3>
    <p>
      Looking at an "http:" URL, while some look more sensible than
      others, it is not immediately evident whether great pains are
      being taken to make the name very persistent. &nbsp;We have
      just discussed such a range of reasons why names can change,
      and clearly the social and contractual arrangements can be
      quite involved, so it is clearly difficult to simply define a
      quality of service for naming. &nbsp;However, defining some
      well known quality of service levels would be a very useful
      task. This is the sort of task ideally suited to a group of
      technologists, librarians or archivists.
    </p>
    <p>
      &nbsp;In any event, for identifiers in the http space and
      many others, it would be useful to be able to assert what the
      quality of service is. This is information about a URI and a
      resource. &nbsp;Like the <a href="Generic.html#Dimensions">information about generic URIs</a>,
      it is about the sort of identity between the URI and the
      resource.
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            Metadata should be used to express the quality of
            service for the binding between a URI and a resource.
          </td>
        </tr>
      </tbody>
    </table>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 01 Jan 2000 00:00:00 GMT</pubDate>
  <title>Dictionaries in the Library?</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/NamespacesAreResources.html</link>
    <guid>https://www.w3.org/DesignIssues/NamespacesAreResources.html</guid>
      <description><![CDATA[
    <h1>
      Dictionaries in the Library?
    </h1>
    <p>
      In his book <q>Goedel, Escher, Bach,</q> the computer
      scientist Douglas Hofstadter ruminates on self-referential
      systems. At times, he uses the approach of a Socratic
      dialogue between two characters from Zeno's fable,
      <q>Achilles and the Tortoise</q>. The conclusion of several
      hundred pages of musings around Bach's fugues, Escher's
      recursive drawings, and Goedel's theorem is that you can't
      try to distinguish <em>wishes</em> from <em>metawishes</em>,
      or the whole system breaks down. Without drawing too many
      parallels with the recent XML-URI discussions, we would like
      to relate a conversation between Achilles and the famous
      tortoise, recently overheard in a library.
    </p>
    <p>
      <em>[Achilles and the Tortoise are each strolling in the
      library. They meet.]</em>
    </p>
    <p class="a">
      Achilles: Ah, Mr. Tortoise, I thought I might find you in the
      library.
    </p>
    <p class="t">
      T: And a very nice library it is too, Achilles.
    </p>
    <p class="a">
      A: Thank you. It was a communal effort. As were the books.
      There are so many really beautiful books in the library.
    </p>
    <p class="t">
      T: And now we have dictionaries!
    </p>
    <p class="a">
      A: Yes, dictionaries are very important to me, Mr. Tortoise.
      I want to use them to understand what some of those books
      mean.
    </p>
    <p class="t">
      T: Let's not discuss meaning, please Achilles -- you know
      what happens when we do that! I want to use these
      dictionaries in order to check that the books are correct.
    </p>
    <p class="a">
      A: Well, at least we are agreed that dictionaries are a good
      idea.
    </p>
    <p>
      <em>[they round a corner]</em>
    </p>
    <p class="t">
      T: Achilles, what is that?!
    </p>
    <p class="a">
      A: Why, a dictionary, Mr. T.
    </p>
    <p class="t">
      T: But it is in the library! I thought when we defined
      dictionaries we agreed it was "not a goal" to register
      dictionaries in the library!
    </p>
    <p class="a">
      A: But surely that doesn't stop me putting one in the
      library?
    </p>
    <p class="t">
      T: Irony heaped on Irony! The Library is for books. That you
      should abuse it so! A dictionary is not a book. It is a
      metabook.
    </p>
    <p class="a">
      A: What? Of course it is a book!
    </p>
    <p class="t">
      T: You said that you wanted it to have the form of a book so we
      make them out of paper -- but that doesn't mean the intent
      was to put it in the library!
    </p>
    <p class="a">
      A: But this is my section of the library -- it is the section
      on Library Architecture and I need a dictionary to define the
      terms used in that field.
    </p>
    <p class="t">
      T: But you know that people can lose things in a library,
      and libraries can burn down ... there are so many reasons
      that dictionaries should <strong>not</strong> be in
      the library, Achilles!
    </p>
    <p class="a">
      A: Look at it this way, Mr. Tortoise: when I am doing research
      in the library, I need to be able to look up words, and so I
      need a dictionary in the library.
    </p>
    <p class="t">
      T: You have some woolly notion of finding out what books
      mean, Achilles, but we haven't agreed about that. The meaning
      of the semantics of "meaning" are not a consensus in current
      linguistic epistemorthosemantisophologic theory.
    </p>
    <p class="a">
      A: I don't need to go into that, but I need a place for
      dictionaries.
    </p>
    <p class="t">
      T: Oh, we have all been discussing where dictionaries should
      go. We have plenty of ideas: We have plans for a new vault
      building down the road much more secure than this library. We
      have that white tower on the hill we could use too.
    </p>
    <p class="t">
      T: Besides, in practice, most of us keep a pocket dictionary
      for each language we use in our briefcases. It isn't as
      though we need so many dictionaries. Frankly, dictionaries
      have such different requirements to books I am shocked to see
      this dictionary in your section of the library! If you don't
      take it out, I will bite your heel.
    </p>
    <p class="a">
      A: But I thought when we designed the library it was so that
      any sort of book could go in it. That is why we called it the
      Global Eternal Bibliotech, after all: it is Good for Every
      Book. I should be able to keep this dictionary in it simply
      because it is a book.
    </p>
    <p class="t">
      T: But Achilles, for the last time, a dictionary is
      <strong>not a book</strong>!
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 10 Dec 2017 00:00:00 GMT</pubDate>
  <title>Net Neutrality: Act now to save the internet as we know it</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/NetNeutrality.html</link>
    <guid>https://www.w3.org/DesignIssues/NetNeutrality.html</guid>
      <description><![CDATA[
    <h1>
      Act now to save the internet as we know it
    </h1>
    <div>
    <div class="cols">
<p>

2017-12-10: In just two days, the Federal Communications Commission (FCC) will vote on a proposal that would fatally undermine net neutrality in the US. This would be a disaster for the internet.
</p>

<img src="diagrams/policy/HelpSaveNetNeutrality.png" style="width:50%; margin: 2em;">

<p>
Net neutralityâ€Šâ€”â€Šthe principle that internet service providers (ISPs) treat all traffic equallyâ€Šâ€”â€Šunderpins the internet as we know it today. It has allowed millions of Americans to build businesses, connect with friends and family, launch social movements, and share their ideas freely.
</p><p>
When I invented the World Wide Web in 1989, I didnâ€™t have to pay a fee, or ask anyone for permission to make it available over the internet. All I had to do was write a new app and plug my computer into the net. If US net neutrality rules are repealed, future innovators will have to first negotiate with each ISP to get their new product onto an internet package. That means no more permissionless space for innovation. ISPs will have the power to decide which websites you can access and at what speed each will load. In other words, theyâ€™ll be able to decide which companies succeed online, which voices are heardâ€Šâ€”â€Šand which are silenced.
</p><p>
Net neutrality separates the connectivity market from the content market. As separate markets, both have flourished. But if the US allows the internet to become like the old cable TV model &mdash; with the same firms controlling the cables and the content &mdash; competition in both markets will suffer. As other countries maintain separate and fiercely competitive markets, America will decline as the world's chief digital innovator.
</p><p>
In the early years of the internet, ISPs didn't have the technical capacity to discriminate traffic online. Their computers were not fast enough and so net neutrality was a fact of life. Over time, as technology developed and the value of content flowing through the network increased, ISPs developed the ability and the incentives to discriminate internet traffic to get a cut of the spoils. We need rules to keep ISPs focused on what they do best: making access cheaper and faster.
</p><p>
Historically, under both Republican and Democratic leadership, the FCC has worked to ensure net neutrality principles were respected, sending a message to ISPs that they could not engage in blocking content, throttling site speed, or charging for content to be prioritised. In 2015, these net neutrality principles were formalised via strong new rules to ensure the internet remained free and open.
</p><p>
But now the new FCC leadership is trying to demolish these protections.
</p><p>
The FCC's proposal, if voted through on December 14, would open the door for ISPs to act on short-term incentives and upend the internet as we know it.
Net neutrality protections are vital to protect the future of competition and innovation in the US; they uphold our right to express ourselves freely and to choose what we read online and who we communicate with. These are American values and fundamental to democracy. Without net neutrality, ISPs would be allowed to exploit their power as gatekeepers, closing the door to the creativity and innovation that make the internet great.
</p><p><b>
I want an internet where content businesses grow according to their quality, not their ability to pay to ride in the fast lane. I want an internet where ideas spread because they're inspiring, not because they chime with the views of telecoms executives. I want an internet where consumers decide what succeeds online, and where ISPs focus on providing the best connectivity.
</b>
</p><p>
If that's the internet you want &mdash; act now. Not tomorrow, not next week. Now.
</p><p>
Over the past few months, I have traveled to Washington, DC to meet with the FCC leadership and almost a dozen members of Congress from both parties. Some, more than others, understood the importance of preserving net neutrality. But they all understand how many votes they need to stay in office. Now is the time to speak up for freedom and fairness.
</p><p>
Our best hope is to make sure our representatives in Congress know that we will hold them to account on this issue, so they call upon FCC Chairman Ajit Pai to suspend Thursday's vote.
</p><p>
Contact your representatives and tell them to save net neutrality.</p>
    <p></p>
    </div>
    <hr>
    <div class="nav">
    <p>
      <a href="Overview.html">Up to Design Issues</a>
    </p>
    <p>
      <a href="../People/Berners-Lee">Tim BL</a>
    </p>
    </div>
  

</div>]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 09 Mar 2009 00:00:00 GMT</pubDate>
  <title>No Snooping</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/NoSnooping.html</link>
    <guid>https://www.w3.org/DesignIssues/NoSnooping.html</guid>
      <description><![CDATA[
    <h1>
      No Snooping
    </h1>
    <p>
      Most of these notes are about architecture at the web layer.
      However, a healthy web for society places requirements also
      on the Internet layer.
    </p>
    <p>
      In 2008, this was threatened in the UK by the company
      <a href="http://en.wikipedia.org/wiki/Phorm">Phorm</a>
      proposing to use data from deep packet inspection (DPI). The
      system would use special apparatus at the Internet Service
      Provider (ISP) to monitor traffic, peek inside the IP
      packet's payload, and determine every URL looked at in a
      household's browsing on the web. This profile would be used
      to provide targeted advertising. They also planned to
      automatically "protect" users by redirecting any access to
      blacklisted (phishing, etc) sites.
    </p>
    <p>
      A discussion was held at the House of Lords by Baroness
      Miller on 2009-02-11. These are some notes I made for the
      event, which I attended.
    </p>
    <ol>
      <li>The Internet in general has and deserves the same
      protection as paper mail and telephone.
      </li>
      <li>In fact you could argue that it needs it more, as it
      carries more of our lives and is more revealing than our
      phone calls or our mail.
      </li>
      <li>The access by an ISP of information within an internet
      packet, other than that information used for routing, is
      equivalent to wiretapping a phone or opening sealed postal
      mail.
      </li>
      <li>The URLs which people use reveal a huge amount about
      their lives, loves, hates, and fears. This is extremely
      sensitive material. People use the web in crisis, when
      wondering whether they have STDs, or cancer, when wondering
      whether they are homosexual and whether to talk about it, to
      discuss political views which to some may be abhorrent,
      and so on.
      </li>
      <li>We use the internet to inform ourselves as voters in a
      democracy. We use the internet to decide what is true and
      what is not. We use the internet for healthcare and social
      interaction and so on. These things will all have a
      completely different light cast on them if the users know
      that the click will be monitored and the data will be shared
      with third parties.
      </li>
      <li>The URLs produced when using forms contain the
      information typed into those forms. Personal data, private
      data.
      </li>
      <li>If people really want privacy, then many users and sites
      may switch to using SSL encryption: to doing their actual
      web surfing through an encrypted tunnel. This takes a lot of
      server CPU cycles, making server farms more expensive. It
      would slow the user's computer. It would effectively slow
      down the whole net. It also prevents the use of HTTP proxies,
      which currently help the efficiency of web access.
      </li>
      <li>There are considerable risks if the information is
      abused. Imagine:
        <ul>
          <li>To be able to buy a profile of a person you are
          interested in;
          </li>
          <li>To discriminate based on profiles of people when
          deciding whether they are suitable to employ;
          </li>
          <li>To discriminate in giving life insurance, and so on,
          against those who have looked up (say) cardiac symptoms
          on the web;
          </li>
          <li>Criminal attacks on government officials at home;
          </li>
          <li>Foreign attacks on the country made by targeting and
          analyzing key individuals;
          </li>
          <li>Predators choosing, stalking, and targeting
          victims;...
          </li>
        </ul>
        <p>
          to name a few.
        </p>
      </li>
      <li>The information could be deliberately abused by an inside
      worker, or could be acquired by an attack on the system's
      machines.
      </li>
      <li>The power of this information is so great that the
      commercial incentive for companies or individuals to misuse it
      will be huge, so it is essential to have absolute clarity
      that it is illegal.
      </li>
      <li>To put this in perspective, it is like the company having
      a video camera inside your house, except that it gives them
      actually much more information about you.
      </li>
    </ol>
    <p>
      The act of reading, like the act of writing, is a pure,
      fundamental, human act. It must be available without
      interference or spying.
    </p>
    <h3>
      Acknowledgements
    </h3>
    <p>
      Thanks to colleagues who reviewed these notes and provided
      useful feedback, including Hal Abelson, Karen Myers, Thomas
      Rössler, Amy van der Hiel, and Danny Weitzner.
    </p>
    <h3>
      References
    </h3>
    <p>
      Phorm in Wikipedia http://en.wikipedia.org/wiki/Phorm
    </p>
    <p>
      The author on BBC news disapproving of the spying on people's
      URLs: http://news.bbc.co.uk/2/hi/technology/7299875.stm
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 01 Oct 2000 00:00:00 GMT</pubDate>
  <title>Notation3</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Notation3.html</link>
    <guid>https://www.w3.org/DesignIssues/Notation3.html</guid>
      <description><![CDATA[
  <img alt="n3" src="file:///home/RDF/icons/n3_small" align="right">

  <h1>Notation 3 Logic</h1>
  <p>This article gives an operational
  semantics for Notation3 (N3) and some RDF&nbsp;properties for
  expressing logic.&nbsp;These properties, together with N3's
  extensions of RDF to include variables and nested graphs, allow
  N3 to be used to express rules in a web environment. &nbsp;<br>
  <br>
  This is an informal semantics in that it should be understandable by
  a human being but is not&nbsp;a machine readable formal
  semantics. This document is aimed at a logician wanting a
  reference by which to compare N3 Logic with other languages, and
  at the engineer coding an implementation of N3 Logic who
  wants to check the detailed semantics.<br>
  <br>
</p>
  <p>These properties are not part of the N3 language, but are
  properties which allow N3 to be used to express rules, and rules
  which talk about the provenance of information, contents of
  documents on the web, and so on.&nbsp; Just as OWL is expressed
  in RDF by defining properties, so rules, queries, differences,
  and so on can be expressed in RDF with the N3 extension to
  formulae.</p>

  <p>The log: namespace has functions, which have built-in meaning
  for CWM and other software.</p>

  <p>See also:</p>

  <ul>
    <li><a href="/2000/10/swap/log.n3">The schema
    for the log: namespace</a></li>

    <li><a href="Diff.html">A
    vocabulary for expressing differences between RDF
    graphs</a></li>

    <li><a href="http://lists.w3.org/Archives/Public/www-rdf-logic/2001Sep/0004.html">
    a formal design for RDF/N3 context/scopes</a><br>
    Dan Connolly to www-rdf-logic, Thu, Sep 06 2001</li>
  </ul>

  <p>The prefix log:&nbsp;&nbsp;is used below as shorthand for the
  namespace &lt;<a href="http://www.w3.org/2000/10/swap/log#">http://www.w3.org/2000/10/swap/log#</a>&gt;.
  See the <a href="/2000/10/swap/logic.n3">schema</a>
  for a summary.</p><br>

  <h2><a name="motivation" id="motivation"></a>
  Motivation</h2><br>
  The motivation of the logic was to be useful as a tool in an open
  web environment.   The Web contains many sources of
  information, with different characteristics and relationships to
  any given reader.  Whereas a closed system may be built
  based on a single knowledge base of believed facts, an open
  web-based system exists in an unbounded sea of interconnected
  information resources. This requires that an agent be aware of
  the provenance of information, and responsible for its
  disposition.  The language for use in this environment
  typically requires the ability to express what document or
  message said what, so the ability to quote subgraphs and match
  them against variable graphs is essential.  This
  quotation and reference, with its inevitable possibility of
  direct or indirect self-reference, if added directly to first
  order logic presents problems such as paradox traps. To avoid
  this, N3 logic has deliberately been kept to limited expressive
  power: it currently contains no general first order
  negation.  Negated forms of many of the built-in
  functions are available, however.<br>
  <br>
  A goal is that information, such as but not limited to rules,
  which requires greater expressive power than the RDF graph,
  should be sharable in the same way as RDF can be
  shared.  This means that one person should be able to
  express knowledge in N3 for a certain purpose, and later
  independently someone else reuse that knowledge for a different
  unforeseen purpose.  As the context of the later use is
  unknown, this prevents us from making implicit closed assumptions
  about the total set of knowledge in the system as a whole.<br>
  <br>
  Further, we require that other users of N3 in the web can express
  new knowledge without affecting systems we have already
  built.  This means that N3 must be fundamentally monotonic:
  the addition of new information from elsewhere, while it might
  cause an inconsistency by contradicting the old information
  (which would have to be resolved before the combined system is
  used), the new information cannot silently change the meaning of
  the original knowledge.<br>
  <br>
  The non-monotonicity of many existing systems follows from a form
  of negation as failure in which a sentence is deemed false if it is
  not held within (or derivable from) the <span style="font-style: italic;">current knowledge
  base</span>.  It is this concept of current knowledge
  base, which is a variable quantity, and the ability to
  indirectly make reference to it which causes the
  non-monotonicity.  In N3Logic, while a current
  knowledge base is a fine concept, there is no ability to make
  reference to it implicitly in the negative.   The
  negation provided is the ability only for a specific given
  document (or, essentially, some abstract formula) to objectively
  determine whether or not it holds, or allows one to derive, a
  given fact.  This has been called <span style="font-style: italic;">Scoped Negation As Failure</span>
  (SNAF).<br>
  <br>

  <h2><a name="syntax" id="syntax"></a> Formal syntax</h2><br>
  The syntax of N3 is defined by the <a href="http://www.w3.org/2000/10/swap/grammar/n3-report.html">context-free
  grammar</a>. This is available in machine-readable form in
  <a href="http://www.w3.org/2000/10/swap/grammar/n3.n3">Notation3</a>
  and <a href="http://www.w3.org/2000/10/swap/grammar/n3.rdf">RDF/XML</a>.<br>

  <br>
  The top-level production for an N3 document is
  &lt;http://www.w3.org/2000/10/swap/grammar/n3#document&gt;.<br>
  <br>
  In the semantics below we will consider these productions using
  notation as follows.<br>
  <br>

  <table style="text-align: left; width: 100%;" border="1" cellpadding="2" cellspacing="2">
    <tbody>
      <tr>
        <th>Production</th>

        <th>N3 syntax examples</th>

        <th>notation below for instances</th>
      </tr>

      <tr>
        <td>symbol</td>

        <td><span style="font-family: monospace;">&lt;foo#bar&gt;
        &nbsp; &nbsp;&lt;http://example.com/&gt;</span></td>

        <td>c d e f</td>
      </tr>

      <tr>
        <td>variable</td>

        <td>Any symbol quantified by @forAll or @forSome in the
        same or an outer formula.</td>

        <td><span style="font-style: italic;">x y z</span></td>
      </tr>

      <tr>
        <td>formula</td>

        <td><span style="font-family: monospace;">{&nbsp; ...
        &nbsp;}</span> &nbsp;or an entire document</td>

        <td>F &nbsp;G H K</td>
      </tr>

      <tr>
        <td>set of universal variables of F</td>

        <td><span style="font-family: monospace;">@forAll :x,
        :y.</span></td>

        <td>uvF</td>
      </tr>

      <tr>
        <td>set of existential variables of F</td>

        <td><span style="font-family: monospace;">@forSome :z,
        :w.</span></td>

        <td>evF</td>
      </tr>

      <tr>
        <td>set of statements of F</td>

        <td></td>

        <td>stF</td>
      </tr>

      <tr>
        <td>statement</td>

        <td>&nbsp; <span style="font-family: monospace;">&lt;#myCar&gt; &nbsp;
        &lt;#color&gt; &nbsp; "green".</span></td>

        <td>F<span style="font-style: italic;">i</span> &nbsp; or
        &nbsp;{s p o}</td>
      </tr>

      <tr>
        <td>string</td>

        <td><span style="font-family: monospace;">"hello
        world"</span></td>

        <td>s</td>
      </tr>

      <tr>
        <td>integer</td>

        <td><span style="font-family: monospace;">34</span></td>

        <td>i</td>
      </tr>

      <tr>
        <td>list</td>

        <td>( 1 2 ?x &nbsp;&lt;a&gt; )</td>

        <td>L M</td>
      </tr>

      <tr>
        <td>Element i of list L</td>

        <td></td>

        <td>L<span style="font-style: italic;">i</span><br></td>
      </tr>

      <tr>
        <td>length of list</td>

        <td></td>

        <td>|L|</td>
      </tr>

      <tr>
        <td>expression</td>

        <td>see grammar</td>

        <td>n m</td>
      </tr>

      <tr>
        <td>Set*</td>

        <td>{$ &nbsp;1, 2, &lt;a&gt; $}</td>

        <td>S T<br></td>
      </tr>
    </tbody>
  </table><br>
  *The set syntax and semantics are not part of the current
  Notation3 language but are under consideration.<br>

  <h2><a name="semantics" id="semantics"></a> Semantics</h2><br>
  <span style="font-style: italic;">Note.&nbsp;&nbsp;The Semantics
  of a generic RDF statement are not defined here.&nbsp;&nbsp;The
  extensibility of RDF is deliberately such that a document may
  draw on predicates from many sources.&nbsp;&nbsp;The statement {n
  c m} expresses that the relationship denoted by c holds between
  the things denoted by n and m.&nbsp;&nbsp;The meaning of
  the&nbsp;&nbsp;statement {n c m} in general is defined by any
  specification for c. The Architecture of the WWW specifies
  informally how the&nbsp; curious can discover information about
  the relation. It discusses how the architecture and management of
  the WWW is such that a given social entity has jurisdiction over
  certain symbols (through for example domain name ownership). This
  philosophy and architecture is not discussed further
  here.&nbsp;&nbsp;Here though we do define the semantics of
  certain specific predicates which allow the expression of the
  language.&nbsp;&nbsp;In analyzing the language the reader is
  invited to consider statements of unknown meaning ground
  facts.&nbsp;&nbsp;N3Logic defines the semantics of certain
  properties. Clearly a system which recognizes further logical
  predicates, beyond those defined here, whose meaning introduces
  greater logical expressiveness would change the properties of the
  logic.</span><br>
  <br>

  <h3>Simplifications</h3>N3 has a number of types of shortcut
  syntax and syntactic sugar.  For simplicity, in this article
  we consider a language simpler than the full N3 syntax referenced
  above though just as expressive, in that we ignore most syntactic
  sugar. The following simplifications are made.<br>
  <br>
  We ignore syntactic sugar of comma and semicolon as shorthand
  notations.   That is, we consider a simpler language in
  which any such syntax has been expanded out. Loosely:<br>
  <br>

  <table style="text-align: left; width: 100%;" border="1" cellpadding="2" cellspacing="2">
    <tbody>
      <tr>
        <th>A sentence of the form</th>

        <th>becomes two sentences</th>
      </tr>

      <tr>
        <td>subject &nbsp; <span style="font-style: italic;">stuff</span> ; <span style="font-style: italic;">morestuff</span> .</td>

        <td>subject <span style="font-style: italic;">stuff</span>
        . &nbsp;subject <span style="font-style: italic;">morestuff</span> .</td>
      </tr>

      <tr>
        <td>subject predicate <span style="font-style: italic;">stuff</span> , &nbsp;object .</td>

        <td>subject predicate <span style="font-style: italic;">stuff</span> . &nbsp;subject
        predicate&nbsp;object .</td>
      </tr>
    </tbody>
  </table><br>
  <br>
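  <p>The expansion in the table above can be sketched in Python. This is
  only an illustration working on an already-parsed structure, not on N3
  text: the function name <code>expand_shorthand</code> and the
  list-of-pairs input shape are assumptions made for this example and are
  not part of any N3 tool.</p>

```python
def expand_shorthand(subject, predicate_objects):
    """Expand ';' and ',' shorthand into one (s, p, o) triple per statement.

    predicate_objects is a list of (predicate, [object, ...]) pairs (a
    hypothetical input shape): each pair stands for one ';'-separated
    group, and each object for one ','-separated item in that group.
    """
    return [(subject, predicate, obj)
            for predicate, objects in predicate_objects
            for obj in objects]


# Models the shorthand: #myCar #color "green" ; #owner #alice , #bob .
triples = expand_shorthand("#myCar", [("#color", ["green"]),
                                      ("#owner", ["#alice", "#bob"])])
for triple in triples:
    print(triple)
```

  <p>Each printed triple corresponds to one fully expanded sentence of the
  simpler language considered in this article.</p>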
  For those familiar with N3, the other simplifications in the
  language considered here are as follows.<br>

  <ul>
    <li>&nbsp;prefixes have been expanded and all qualified names
    replaced with symbols using full URIs between angle
    brackets.</li>

    <li>The path syntax which uses&nbsp;&nbsp; "!" and "^"&nbsp; is
    assumed expanded into its equivalent blank node form;</li>

    <li>The "is ... of " backwards construction has been replaced
    by the equivalent forwards direction syntax.</li>

    <li>The "=" syntax is not used as shorthand for owl:sameAs. In
    fact, we use = here in the text for value equality.</li>

    <li>@keywords is not used</li>

    <li>The &nbsp;@a &nbsp;shorthand for rdf:type is replaced with
    a direct use of the full URI symbol for rdf:type</li>

    <li>all ?x forms are replaced with explicit universal
    quantification in the enclosing parent of the current
    formula.</li>
  </ul><br>
  Notation3 has explicitly quantified existential variables as well
  as blank nodes.  The description below does not mention
  blank nodes, although they are very close in semantics to
  existentially quantified variables.   We consider for
  now a simpler language in which blank nodes have been replaced by
  explicitly named variables  existentially quantified in
  the same formula.<br>
  <br>
  We have only included strings and integers, rather than the whole
  set of RDF types and user-defined types.<br>
  <br>
  These simplifications will not deter us from using N3 shorthand
  in examples where it makes them more readable, so the reader is
  assumed familiar with them.<br>

  <h2>Defining N3 Entailment</h2>The RDF specification defines a
  very weak form of entailment, known as RDF entailment or simple
  entailment.  Here we define the equivalent very simple
  N3-entailment. This does not provide us with useful powers of
  inference: it is almost textual inclusion, but
  just  has conjunction elimination (statement removal) ,
  universal elimination, existential introduction and variable
  renaming. Most of this is quite traditional.  The
  only thing to distinguish N3 Logic from typical logics is
  the Formula, which allows N3 sentences to make statements about
  N3 sentences.   The following details are included for
  completeness and may be skipped.<br>

  <h3>Substitution</h3><span style="font-style: italic;">Substitution is defined to recursively
  apply inside compound terms, as is usual.&nbsp;&nbsp;Note only
  that substitution does descend into compound terms, while
  substitution of owl:sameAs, discussed later, does
  not.</span><br>
  <br>
  We define a substitution operator
  σ<sub><span style="font-style: italic;">x</span>/m</sub>
  which replaces occurrences of the variable <span style="font-style: italic;">x</span> with the expression m.  For
  compound terms (list, formula or set), substitution is
  performed by substituting in each component, recursively.<br>
  <br>
  Abbreviating  the substitution
   σ<sub><span style="font-style: italic;">x</span>/m</sub>
  as  σ , we define substitution operator as
  usual:<br>
  <br>
  σ<span style="font-style: italic;">x</span> = m  
      (<span style="font-style: italic;">x</span> is
  replaced by m)<br>
  σ<span style="font-style: italic;">y</span> = <span style="font-style: italic;">y</span>        (y not
  equal to x)<br>
  σa = a        (symbols and literals are
  unchanged)<br>
  σi = i<br>
  σs = s         <br>
  σ( a b ... c )  =  ( σa σb ...
  σc )              
          (substitution goes into compound
  terms)<br>
  σ{$ a, b, ... c  $}   =  {$ σa,
  σb, ... σc  $}<br>
  uv σF  = σ uvF<br>
  ev σF  = σ evF<br>
  st  σF = σ stF<br>
  <br>
  In general a substitution operator is the sequential application
  of single substitutions:<br>
  <br>
  σ = σ<sub><span style="font-style: italic;">x</span>1/m1</sub>σ<sub><span style="font-style: italic;">x</span>2/m2</sub>σ<sub><span style="font-style: italic;">x</span>3/m3</sub> ...
  σ<sub><span style="font-style: italic;">x</span>n/mn</sub><br>
  <br>
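  <p>The substitution rules above can be sketched in Python. This is a
  minimal illustration under stated assumptions, not a reference
  implementation: the classes <code>Var</code> and <code>Formula</code>,
  the use of tuples for lists and statements, and of
  <code>frozenset</code> for sets are all choices made for this example.
  Symbols, strings and integers are modelled as plain Python values.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A variable (hypothetical representation for this sketch)."""
    name: str


@dataclass(frozen=True)
class Formula:
    """A formula as three sets: uvF, evF and stF (hypothetical layout)."""
    uv: frozenset  # universal variables
    ev: frozenset  # existential variables
    st: frozenset  # statements, each a (subject, predicate, object) tuple


def subst(term, x, m):
    """Apply the single substitution sigma_{x/m} to a term, recursively."""
    if isinstance(term, Var):
        return m if term == x else term            # sigma x = m, sigma y = y
    if isinstance(term, tuple):                    # lists and statements
        return tuple(subst(t, x, m) for t in term)
    if isinstance(term, frozenset):                # sets
        return frozenset(subst(t, x, m) for t in term)
    if isinstance(term, Formula):                  # uv, ev and st of sigma F
        return Formula(subst(term.uv, x, m),
                       subst(term.ev, x, m),
                       subst(term.st, x, m))
    return term                                    # symbols and literals


def subst_all(term, pairs):
    """Sequential application of single substitutions, in the given order."""
    for x, m in pairs:
        term = subst(term, x, m)
    return term
```

  <p>For example, <code>subst((1, Var("x"), "a"), Var("x"), "m")</code>
  descends into the list and replaces only the variable
  <code>x</code>.</p>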

  <h3>Value equality&nbsp;</h3><br>
  <span style="font-style: italic;">Value equality between terms is
  defined in an ordinary way, compatible with RDF.</span><br>
  <br>
  For concepts which exist in RDF, we  use RDF equality. This
  is RDF node equality.  These atomic concepts have a simple
  form of equality.<br>
  <br>
  For lists, equality is defined as a pairwise matching.<br>
  <br>
  For sets, equality is defined as a mapping between equal terms
  existing in each direction.<br>
  <br>
  For formulae, equality F = G is defined as a 
  substitution σ existing mapping variables to
  variables.  (Note that as here RDF Blank Nodes are
  considered as existential variables, the substitution will map
  b-nodes to b-nodes.)<br>
  <br>
  The table below is a summary for completeness.<br>
  <br>

  <table style="text-align: left; width: 100%;" border="1" cellpadding="2" cellspacing="2">
    <tbody>
      <tr>
        <th>Production</th>

        <th>Equality</th>
      </tr>

      <tr>
        <td>symbol</td>

        <td>uri is equal unicode string</td>
      </tr>

      <tr>
        <td>variable</td>

        <td>variable name is equal unicode string</td>
      </tr>

      <tr>
        <td>formula</td>

        <td>&nbsp;F = G iff &nbsp; |stF| = |stG| and there is some
        substitution&nbsp; σ such
        that&nbsp;(∀<span style="font-style: italic;">i</span> .&nbsp;∃<span style="font-style: italic;">j</span> .&nbsp; σ<span style="font-style: italic;">Fi</span> = σG<span style="font-style: italic;">j.&nbsp;</span>)</td>
      </tr>

      <tr>
        <td>statement</td>

        <td>&nbsp;Subjects are equal, predicates are equal, and
        objects are equal</td>
      </tr>

      <tr>
        <td>string</td>

        <td>&nbsp;equal unicode string</td>
      </tr>

      <tr>
        <td>integer</td>

        <td>&nbsp;equal integer</td>
      </tr>

      <tr>
        <td>list L = M</td>

        <td>&nbsp;|L|&nbsp; =&nbsp; |M| &nbsp; &nbsp; &nbsp;
        &nbsp;&amp; &nbsp; &nbsp;(∀<span style="font-style: italic;">i</span> . L<span style="font-style: italic;">i</span> = M<span style="font-style: italic;">i &nbsp;)</span></td>
      </tr>

      <tr>
        <td>set &nbsp; S = T&nbsp;</td>

        <td>(∀<span style="font-style: italic;">i</span>
        .&nbsp;∃<span style="font-style: italic;">j</span>
        .&nbsp; S<span style="font-style: italic;">i</span> =
        T<span style="font-style: italic;">j.&nbsp;</span>) &nbsp;
        &amp; &nbsp;(∀<span style="font-style: italic;">i</span> .&nbsp;∃<span style="font-style: italic;">j</span> .&nbsp; T<span style="font-style: italic;">i</span> = S<span style="font-style: italic;">j.&nbsp;</span>)</td>
      </tr>

      <tr>
        <td>formula F = G</td>

        <td>∃σ<span style="font-style: italic;">.&nbsp;</span>σ F
        =&nbsp;σ G</td>
      </tr>

      <tr>
        <td style="font-style: italic;">unicode string</td>

        <td>Unicode strings should be in canonical form. They are
        equal if the corresponding characters have numerically
        equal code points.</td>
      </tr>
    </tbody>
  </table><br>
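  <p>The list and set rows of the table can be sketched in Python as
  follows. This is an illustration only: lists are modelled as tuples and
  sets as <code>frozenset</code>, the function names are made up for this
  example, and formula equality (which needs a search for a variable
  substitution) is deliberately left out.</p>

```python
def list_equal(L, M, eq):
    """Lists are equal if they have equal length and are pairwise equal."""
    return len(L) == len(M) and all(eq(a, b) for a, b in zip(L, M))


def set_equal(S, T, eq):
    """Sets are equal if every member of each has an equal member in the other."""
    return (all(any(eq(s, t) for t in T) for s in S)
            and all(any(eq(t, s) for s in S) for t in T))


def term_equal(a, b):
    """Value equality for atoms, lists (tuples) and sets (frozensets)."""
    if isinstance(a, tuple) and isinstance(b, tuple):
        return list_equal(a, b, term_equal)
    if isinstance(a, frozenset) and isinstance(b, frozenset):
        return set_equal(a, b, term_equal)
    # Atoms: same kind of term and same value (string or integer).
    return type(a) is type(b) and a == b
```

  <p>Checking both directions in <code>set_equal</code> matters: with only
  one direction, a set would compare equal to any superset of itself.</p>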

  <h3>Conjunction</h3><span style="font-style: italic;">N3, like
  RDF, has an implied conjunction, with its normal properties,
  between the statements of a formula.&nbsp;</span><br>
  <br>
  The semantics of a formula which has no quantifiers (@forAll or
  @forSome) are the conjunction of the semantics of the statements
  of which it is composed.<br>
  <br>
  We define the conjunction elimination operator ce(i) of removing
  the statement F<span style="font-style: italic;">i</span> from
  formula F.  By the conventional semantics of conjunction,
  the ce(i) operator is truth-preserving.  If you take a
  formula and remove a statement from it it is still true.<br>
  <br>
  CE:   From     F  follows    ce(i)
   F<br>

  <h3>Existential quantification</h3><span style="font-style: italic;">Existential quantifiers and Universal
  quantifiers have the usual qualities</span><br>
  Any formula, including the <span style="font-style: italic;">root
  formula</span> which matches the "document" production of the
  grammar,  may have a set of existential variables indicated
  by an <span style="font-family: monospace;">@forSome</span>
  declaration.   This indicates that, where the formula
  is considered true, it is true for at least one substitution
  mapping the existential variables onto non-variables.<br>
  <br>
  As usual, we define a truth-preserving  Existential
  Introduction operator on formulae, that of introducing an
  existentially quantified variable in place of any term. The
  operation  ei(x, n) is defined as<br>

  <ol>
    <li>Creation of a new variable <span style="font-style: italic;">x</span> which occurs&nbsp;nowhere
    else</li>

    <li>The application of&nbsp;σ<sub><span style="font-style: italic;">x</span>/n</sub> to F</li>

    <li>The addition of <span style="font-style: italic;">x</span>
    &nbsp;to evF.</li>
  </ol><br>
  EI:    From  F   follows  ei(x,n)
   F    for any <span style="font-style: italic;">x</span> not occurring anywhere else<br>
  <br>

  <h3>Universal quantification</h3><br>
  Any formula,  (including the root formula), may have a set
  of universal variables.  These are indicated by 
  <span style="font-family: monospace;">@forAll</span>
   declarations.  The scope of the @forAll is outside the
  scope of any @forSome.<br>
  <br>

  <p>If both universal and existential quantification are specified
  for the same context, then the scope of the universal
  quantification is outside the scope of the existentials:</p>
  <pre>{ @forAll &lt;#h&gt;. @forSome &lt;#g&gt;. &lt;#g&gt; &lt;#loves&gt; &lt;#h&gt; }.
</pre>

  <p>means</p>

  <p>∀&lt;#h&gt;&nbsp; ( ∃&lt;#g&gt; &nbsp;(
  <span style="font-family: monospace;">&nbsp;</span>&lt;#g&gt;
  &lt;#loves&gt; &lt;#h&gt; ))</p><br>
  The semantics of @forAll is that  for any substitution
  σ = subst(<span style="font-style: italic;">x</span>, n)
  where  x member of  uvF,  if  F is true then
  σF is also true.  Any @forAll declaration may also be
  removed, preserving truth.  Combining these, we define a
  truth-preserving operation  ue(x, n)  such that
   ue(x, n) F is formed by<br>

  <ol>
    <li>Removal of &nbsp;x from &nbsp;uvF</li>

    <li>Application of subst(x, n)</li>
  </ol>We have the axiom of universal elimination<br>
  <br>
  UE:  From     F       follows
    ue(x, n)   F    for all x in uvF<br>
  <br>

  <h3>Variable renaming</h3><br>
  As the actual variable used in a formula is quite irrelevant to
  its semantics, the operation of replacing that variable with
  another one not used elsewhere within the formula is
  truth-preserving.<br>
  <br>
  We define the operation of variable renaming
  vr(<span style="font-style: italic;">x,y</span>) on F when x is a
  member of uvF or is a member of evF.<br>
  <br>
  VR:  From   F   follows  
   vr(<span style="font-style: italic;">x, y</span>) F  
   where  <span style="font-style: italic;">x</span> is
  in uvF or evF and <span style="font-style: italic;">y</span> does
  not occur in F<br>
  <br>
  Occurrence in F is defined recursively in the same way as
  substitution:  <span style="font-style: italic;">x</span>
  occurs in F iff σ<sub><span style="font-style: italic;">x</span>/n</sub>F is not equal to F for
  arbitrary n.<br>
  <br>
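  <p>Universal elimination and variable renaming, as described above, can
  be sketched together in Python. This is a simplified illustration under
  assumptions of its own: <code>Var</code> and <code>Formula</code> are
  hypothetical classes for this example, statements are tuples, and nested
  formulae and sets inside statements are not handled.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Formula:
    uv: frozenset  # universal variables
    ev: frozenset  # existential variables
    st: frozenset  # statements as (subject, predicate, object) tuples


def subst(term, x, n):
    """Replace variable x by n, descending into tuples (lists, statements)."""
    if term == x:
        return n
    if isinstance(term, tuple):
        return tuple(subst(t, x, n) for t in term)
    return term


def occurs(term, x):
    """True if x appears anywhere inside term."""
    if term == x:
        return True
    return isinstance(term, tuple) and any(occurs(t, x) for t in term)


def universal_elimination(F, x, n):
    """ue(x, n): drop x from the universal variables, then substitute n for x."""
    if x not in F.uv:
        raise ValueError("x must be a universal variable of F")
    return Formula(F.uv - {x}, F.ev, frozenset(subst(s, x, n) for s in F.st))


def variable_renaming(F, x, y):
    """vr(x, y): rename quantified variable x to y, where y is unused in F."""
    if x not in F.uv and x not in F.ev:
        raise ValueError("x must be quantified in F")
    if y in F.uv or y in F.ev or any(occurs(s, y) for s in F.st):
        raise ValueError("y must not occur in F")

    def rename(variables):
        return frozenset(y if v == x else v for v in variables)

    return Formula(rename(F.uv), rename(F.ev),
                   frozenset(subst(s, x, y) for s in F.st))
```

  <p>Applied to the formula from the "loves" example, with
  <code>h</code> universal and <code>g</code> existential,
  <code>universal_elimination(F, h, "#alice")</code> leaves a formula whose
  only statement is <code>(g, "#loves", "#alice")</code>, still
  existentially quantified over <code>g</code>.</p>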

  <h3>Union of formulae</h3>The union H = F∪G of two formulae F
  and G is formed, as usual,  as follows.<br>
  <br>
  A variable renaming operator is applied to G such that the
  resulting formula G' has no variables which occur un-quantified
  or differently quantified or existentially quantified in F, and
  vice-versa.  (F and G' may share universal
  variables.)<br>
  <br>
  F∪G is then defined by:<br>
  <br>
  st(F∪G) = stF ∪ st G'  ;    ev(F∪G)
   =  evF ∪ evG' ;     uv(F∪G) =
  uvF ∪ uv G'<br>
  <br>
  <br>
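  <p>To see why the renaming step matters, here is a small sketch in
  which both formulae happen to use the same existential variable
  (all symbols are invented for illustration).  Suppose F is</p>
  <pre>@forSome &lt;#a&gt;. &lt;#a&gt; &lt;#name&gt; "Alice".
</pre>

  <p>and G is</p>
  <pre>@forSome &lt;#a&gt;. &lt;#a&gt; &lt;#name&gt; "Bob".
</pre>

  <p>Renaming &lt;#a&gt; in G to an unused variable &lt;#b&gt; gives G',
  and the union F∪G can then be written</p>
  <pre>@forSome &lt;#a&gt;, &lt;#b&gt;.
&lt;#a&gt; &lt;#name&gt; "Alice".
&lt;#b&gt; &lt;#name&gt; "Bob".
</pre>

  <p>Without the renaming, the union would wrongly assert that one
  and the same thing has both names.</p>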

  <h3>N3 entailment</h3>

  <p>The operators conjunction elimination, existential
  introduction, universal elimination and variable
  renaming are truth preserving.&nbsp;&nbsp;We define an
  N3 entailment operator (τ) as any operator which is the
  successive application of any sequence (possibly empty) of
  such operators.&nbsp;&nbsp;We say a formula F n3-entails a
  formula τ F.&nbsp;&nbsp;By a combination
  of SE, EI, UE and VR, τ F logically
  follows from F.</p><span style="font-style: italic;">&nbsp;Note.
  &nbsp;RDF Graph is a subclass of N3 formula. &nbsp;If F and G are
  RDF graphs, only CI and EI apply and n3-entailment
  reduces&nbsp;to simple entailment from RDF Semantics. (@@check
  for any RDF weirdnesses)<br>
  <br></span>We have now defined this simple form of
  N3-entailment, which amounts to little more than textual
  inclusion in one expression of a subset of another.  We
  have not defined the normal collection of implication,
  disjunction and negation which first order logic has, as N3 logic does
  not provide for first order negation.  We have, in the
  process,  defined a substitution operation which we can now
  use to define implication, which allows us to express
  rules.  <span style="font-style: italic;"><br></span><br>

  <h2>Logic properties and built-in functions</h2>We now define the
  semantics of N3 statements whose predicate is one of a small set
  of logic properties.  These are statements whose truth can
  be established by performing calculations, or by accessing the
  web.  <br>
  <br>
  One of our objectives was to make it possible to make statements
  about, and to query, other statements such as the contents of
  data in information resources on the web.  We have, in
  formulae, the ability to represent such sets  of statements.
   Now, to allow statements about them, we take some of the
  relationships we have defined and give them URIs so that these
  statements and queries can be written in N3.<br>
  <br>
  While the properties we introduced can be used simply as ground
  facts in a database, it is very useful to take advantage of
  the fact that they can be calculated.  In some
  cases, the truth or falsehood of a binary relation can be
  calculated; in others, the relationship is a function so one
  argument (subject or object of the statement) can be calculated
  from the other.<br>
  <br>
  We now show how such properties are defined, and give examples of
  how an inference system can use them.  A motivation
  here is to do for logical information what RDF did for data: to
  provide a common data model and a common syntax, so that
  extensions of the language  are made simply by defining new
  terms in an ontology.  Declarative programming languages
  like Scheme[@@] of course do this.  However, they differ in
  their choice of pairs rather than the RDF binary relational model
  for data, and lack the use of universal identifiers as
  symbols.  The goal with N3 was to make a
  minimal  extension to the RDF data model, so that the
  same language could be used for logic and data, which in practice
  are mixed as a colloidal solution in many real
  applications.<br>
  <br>

  <h3>Calculated entailment</h3><br>
  We introduce also a set of properties whose truth may be
  evaluated directly by machine.   We call these
  "built-in" functions.  The implementation as built-in
  functions is  not in general required for any
  implementation of the N3 language, as they can always soundly be
  treated as ground facts.  However, their usefulness
  derives from their implementation. We say that for example 
  { 1 math:negation  -1 } is entailed by
  calculation.    Like other RDF properties,
  the set is designed to be extensible, as others can use URIs for
  new functions. A much larger set of such properties is <a href="http://www.w3.org/2000/10/swap/doc/CwmBuiltins">described for
  example in the CWM built-ins list</a>, and the semantics of those
  are not described here.<br>
  <br>
  When the truth of a statement can be deduced because its
  predicate is a built-in function, then we call the derivation
   of the statement from no other evidence <span style="font-style: italic;">calculated entailment</span>.<br>
  <br>
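  <p>As a hedged sketch of how a built-in might be used, the rule below
  relies on math:negation being computed rather than looked up.  The
  math: namespace is the one used by the cwm built-ins; the properties
  :balance and :owed and the symbol :acct are invented for this
  example, and the =&gt; shorthand for rules is introduced under
  log:implies below.</p>
  <pre>@prefix math: &lt;http://www.w3.org/2000/10/swap/math#&gt;.
@prefix : &lt;#&gt;.

:acct :balance -20.

{ ?a :balance ?b. ?b math:negation ?n } =&gt; { ?a :owed ?n }.
</pre>

  <p>A reasoner which implements math:negation as a built-in function
  could establish the second statement of the antecedent by
  calculation, with no such statement in the data, and so conclude
  that :acct :owed 20.  A reasoner without the built-in would simply
  find no match, which is still sound.</p>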
  We now define a small set of such properties which provide the
  power of N3 logic for inference on the web.

  <h3>log:includes</h3>If a formula F n3-entails another
  formula G, this is expressed in N3 logic as<br>
  <br>
   F <span style="font-family: monospace;">log:includes</span>
  G.<br>
  <br>
  <span style="font-style: italic;">Note. &nbsp;In deference to the
  fact that RDF treats lists not as terms but as things constructed
  from first and rest pairs, we can view formulae which include
  lists as including rdf:first and rdf:rest statements. &nbsp;The
  effect on inclusion is that two other entailment operations are
  added: the addition of any statement of the form
  &nbsp;</span><span style="font-family: monospace; font-style: italic;">L rdf:first
  n</span><span style="font-style: italic;"> where n is the first
  element of L, or L rdf:rest K where K is the list formed of the
  remaining elements of L. &nbsp; This is not essential
  to a further understanding of the logic, nor to the operation of
  a system which does not contain any explicit mention of the terms
  rdf:first or rdf:rest.</span><br>
  <br>
  For the discussion of n3-entailment, clearly:<br>
  <br>
  From    F   and   F log:includes G  
  logically follows   G<br>
  <br>
  This can be calculated, because it is a mathematical operation on
  two compound terms.  It is typically used in a query to test
  the contents of a formula.  Below we will show how it can be
  used in the antecedent of a rule.<br>
  <br>
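  <p>A minimal sketch, with invented symbols: the following statement
  is true by calculation alone, because every statement of the object
  formula appears in the subject formula.</p>
  <pre>@prefix log: &lt;http://www.w3.org/2000/10/swap/log#&gt;.
@prefix : &lt;#&gt;.

{ :joe :parent :alan. :alan :sister :susie } log:includes { :joe :parent :alan }.
</pre>

  <p>Swapping subject and object would give a statement which is not
  true, as the smaller formula says nothing about the sister of
  :alan.</p>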

  <h3>log:notIncludes</h3><br>
  We write of formulae F and G:  F log:notIncludes G if it is
  <span style="font-weight: bold;">not</span> the case that F
  n3-entails G.<br>
  <br>
  As a form of negation, log:notIncludes is completely monotonic.
  It can be evaluated by a mathematical calculation on the
  value of the two terms: no other knowledge gained can influence
  the result.  This is the <span style="font-style: italic;">scoped negation as failure</span> mentioned
  in the introduction.  This is not a non-monotonic negation
  as failure.<br>
  <br>
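  <p>Continuing with invented symbols, this sketch shows the scoping.
  The statement below is true whatever else is known about Boston,
  because it concerns only the contents of the quoted subject
  formula.</p>
  <pre>@prefix log: &lt;http://www.w3.org/2000/10/swap/log#&gt;.
@prefix : &lt;#&gt;.

{ :Boston :weather :sunny } log:notIncludes { :Boston :weather :cold }.
</pre>

  <p>Learning from some other source that :Boston :weather :cold would
  not change its truth, since the subject formula itself is
  unchanged.</p>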

  <p><span style="font-style: italic;">Note on computation: To
  ascertain whether F n3-entails G in the worst case involves
  checking all possible n3-entailment transformations
  which are combinations of the variables which occur in G. This
  operation may be tedious: it is strictly graph isomorphism
  complete. However, the use of symbols rather than variables
  for a good proportion of nodes makes it much more tractable for
  practical graphs.&nbsp;&nbsp;The ethos that it is a good idea to
  name things with URIs (symbols in N3) is a basic meme of web
  architecture [AWWW].&nbsp;&nbsp;It has direct practical
  application in the calculation of n3-entailment, as comparison of
  graphs whose nodes are labelled is much faster (of order n log
  n).</span></p>

  <h3><a name="log:implie" id="log:implie">log:implies</a></h3>The
  log:implies property relates two formulae, expressing
  implication.   The shorthand notation for log:implies is
    <span style="font-family: monospace;">=&gt;</span>
  .  A statement using log:implies, unlike log:includes,
  cannot be calculated.  It is not a built-in function,
  but the predicate which allows the expression of a rule.<br>
  <br>
  <span style="font-style: italic;">The semantics of implication
  are standard, but we elaborate them now for
  completeness.</span><br style="font-style: italic;">
  <br>
  F log:implies G is true if and only if when the formula F is true
  then also G is true.<br>
  <br>
  MP:   From    F  and     
  F => G     follows     G<br>
  <br>
  A statement in formula H of the form F => G can be
  considered a rule, in which case the subject F is the premise
  (antecedent) of the rule, and the object G is the
  consequent.<br>
  <br>
  Implication is normally used within a formula with universally
  quantified variables.<span style="font-family: sans-serif;"><span style="font-style: italic;"><span style="font-weight: bold;"><br>
  <br></span></span></span>For example, universal quantifiers
  are  used with a rule in H as follows.  Here H is
  the formula containing the rules, and K the formula upon which
  the rules are applied, which we can call the knowledge
  base.<br>
  <br>
  If F => G is in H, then for every σ which
  is a transformation composed of universal eliminations of
  variables universally quantified in H, it also
  follows that σF => σG. Therefore, for
  every σ such that K includes σF,
  σG follows from K.<br>
  <br>
  In the particular case that H and K are both the knowledge base,
  or formula believed true at the top level, then<br>
  <br>
  GMP:    From      F  => G
   and  σF   follows    σG
        if σ is a transformation composed of
  universal eliminations of variables universally quantified at the
  top level.<br>

  <h4>Filtering</h4>When a knowledge base (formula) contains a lot
  of information, one way to filter off a subset is to run a set of
  rules on the knowledge base, and take only the new data which is
  generated by the rules.   This is the filter
  operation.<br>
  <br>
  When you apply rules to a knowledge base, the <span style="font-style: italic;">filter result</span> of rules in H applied
  to K is the union of all σG for every statement F
  => G which is in H,  for every σ which is
  a transformation composed of universal eliminations of variables
  universally quantified in H such that K includes σF.<br>

  <h4>Repeated application of rules</h4>When the conclusions of rules are added back
  repeatedly into the same knowledge base,  in order to
  prevent the unnecessary extra growth of the knowledge base,
  before adding σG to it,  there is a check to see
  whether K already includes σG, and if it does, the
  adding of σG is skipped.<br style="font-style: italic;">
  <br>
  Let the result of rules in H applied to K, 
  ρ<sub>H</sub>K,  be the union of K with
  all σG for every statement F => G which is in
  H,  for every σ which is a transformation
  composed of universal eliminations of variables universally
  quantified in H, such that K includes σF, and K does not
  n3-entail σG.<br>
  <br style="font-style: italic;">
  <br>
  <span style="font-style: italic;">Note. This form of rule allows
  existentials in the consequent: it is not datalog.&nbsp;&nbsp;It
  is clearly possible in a forward-chaining reasoner to generate
  an unbounded set of conclusions with rules of the&nbsp;form
  (using shorthand)</span><br style="font-style: italic;">
  <br style="font-style: italic;">
  <span style="font-style: italic;">&nbsp; { &nbsp;?x
  a&nbsp;:Person } &nbsp;=&gt; { ?x &nbsp;:mother [ a :Person]
  }.</span><br style="font-style: italic;">
  <br style="font-style: italic;">
  <span style="font-style: italic;">While this is a trap for the
  unwary user of a forward-chaining reasoner, it was found to be
  essential in general to be able to generate arbitrary RDF
  containing blank nodes, for example when translating information
  from one ontology into another.</span><br>
  <br>
  Consider the  repeated application of rules in H to K, 
  ρ<sup style="font-style: italic;"><span style="font-style: italic;">i</span></sup><sub>H</sub>K.  If there
  are no existentially quantified variables in the consequents of
  any of the rules in H, then this is like datalog, and there will
  be some threshold <span style="font-style: italic;">n</span>
  above which no more data is added, and there is a closure:
  ρ<sup style="font-style: italic;"><span style="font-style: italic;">i</span></sup><sub>H</sub>K =
  ρ<sup style="font-style: italic;"><span style="font-style: italic;">n</span></sup><sub>H</sub>K  for all
  <span style="font-style: italic;">i</span>><span style="font-style: italic;">n</span>.   In fact in many practical
  applications even with the datalog constraint removed, there is
  also a closure.  This ρ<sup>∞</sup><sub>H</sub>K
  is the result of running a forward-chaining reasoner on H and
  K.<br>

  <h4>Rule Inference on the knowledge base</h4>In the case in which
  rules are in the same formula as the data, the single rule
  operation can be written  ρ<sub>K</sub>K, and the
  closure under rule application
  ρ<sup>∞</sup><sub>K</sub>K<br>
  <span style="font-weight: bold;"><br></span> <span style="font-style: italic;">Cwm note: &nbsp; the --rules command line
  option calculates &nbsp;ρ</span><sub style="font-style: italic;">K</sub><span style="font-style: italic;">K,
  and the --think calculates&nbsp;ρ</span><sup style="font-style: italic;">∞</sup><sub style="font-style: italic;">K</sub><span style="font-style: italic;">K.
  &nbsp;The --filter=H calculates the filter result of H on the
  knowledge base.<br>
  <br></span>
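  <p>As a sketch of closure in the datalog-like case (the property
  :ancestor and the symbols :a to :d are invented for this example),
  consider a knowledge base K holding one rule and three facts:</p>
  <pre>@prefix : &lt;#&gt;.

{ ?x :ancestor ?y. ?y :ancestor ?z } =&gt; { ?x :ancestor ?z }.

:a :ancestor :b.
:b :ancestor :c.
:c :ancestor :d.
</pre>

  <p>One application of the rule, ρ<sub>K</sub>K, adds :a :ancestor :c
  and :b :ancestor :d.  A second application adds :a :ancestor :d.  A
  third adds nothing new, so the closure
  ρ<sup>∞</sup><sub>K</sub>K is reached after two applications.  The
  rule has no existential in its consequent, which is why the process
  is bound to stop.</p>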

  <h3><span style="font-style: italic;">Examples</span></h3>Here a
  simple rule uses log:implies.<br>
  <br>
  <pre>@prefix log: &lt;http://www.w3.org/2000/10/swap/log#&gt;.<br>@keywords.<br>@forAll x, y, z. {x parent y. y sister z} log:implies {x aunt z}.
</pre>

  <p>This N3 formula has three universally quantified variables and
  one statement.&nbsp;&nbsp;The subject of the statement,&nbsp;</p>
  <pre>{x parent y. y sister z}
</pre>

  <p>is the antecedent of the rule and the object, &nbsp;</p>
  <pre>{x aunt z}
</pre>

  <p>is the conclusion. Given data</p>
  <pre>Joe parent Alan.<br>Alan sister Susie.<br><br>
</pre>

  <p>a rule engine would conclude</p>
  <pre>Joe aunt Susie.
</pre>

  <p>As a second example, we use a rule which looks inside a
  formula:</p>
  <pre>@forAll w, x, y, z.<br>{ x wrote y.<br>  y log:includes {z weather w}.<br>  x livesIn z<br>} log:implies {<br>  z weather w<br>}.
</pre>

  <p>Here the rule fires when x is bound to a symbol denoting some
  person who is the author of a formula y, when the formula makes a
  statement about the weather in (presumably some place) z, and x's
  home is z.&nbsp;&nbsp;That is, we believe statements about the
  weather at a place only from people who live there.&nbsp; Given
  the data</p>
  <pre>Bob livesIn  Boston.<br>Bob wrote  { Boston weather sunny }.<br>Alice livesIn Adelaide.<br>Alice wrote { Boston weather cold }.
</pre>

  <p>a valid inference would be</p>
  <pre>Boston weather sunny.
</pre>

  <h3>log:supports</h3><br>
  We say that F log:supports G if there is some sequence of
  rule inference and/or calculated entailment and/or n3-entailment
  operators which when applied to F produce G.<br>
  <br>

  <h3>log:conclusion</h3><br>
  <br>
  The log:conclusion property expresses the relationship between a
  formula and its deductive closure under operations of
  n3-entailment, rule entailment and calculated entailment.
   <br>
  <br>
  As noted above, there are circumstances when this will not be
  finite.<br>
  <br>
  log:conclusion is the transitive closure of log:supports.<br>
  <br>
  log:supports can be written in terms of log:conclusion and
  log:includes.<br>
  <br>
  { ?x log:supports ?y }   if and only if   {
  ?x log:conclusion [ log:includes ?y ]}<br>
  <br>
  However, log:supports may be evaluated in many cases without
  evaluating log:conclusion: one can determine whether y can be
  derived from x in many ways, such as backward chaining, without
  necessarily having to evaluate the (possibly infinite) deductive
  closure.<br>
  <br>
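  <p>A sketch of the two equivalent forms, using invented symbols.  The
  quoted subject formula contains one fact and one rule, and the
  object formula is something which can be derived from them:</p>
  <pre>@prefix log: &lt;http://www.w3.org/2000/10/swap/log#&gt;.
@prefix : &lt;#&gt;.

{ :joe :parent :alan.
  { ?x :parent ?y } =&gt; { ?y :child ?x }.
} log:supports { :alan :child :joe }.

{ :joe :parent :alan.
  { ?x :parent ?y } =&gt; { ?y :child ?x }.
} log:conclusion [ log:includes { :alan :child :joe } ].
</pre>

  <p>The first statement could be checked by backward chaining from
  :alan :child :joe, without building the whole closure which the
  second statement names.</p>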
  Now we have a system which has the capacity to do inference using
  rules, and to operate on formulae.  However, it
  operates in a vacuum.  In fact, our goal is that the
  system should operate in the context of the web.<br>
  <br>

  <h2>Involving the Web</h2>We therefore expose the web as a
  mapping between URIs and the information returned when such a URI
  is dereferenced, using appropriate protocols.  In
  N3,  the information resource is identified by a
  symbol, which is in fact its URI. In N3, information is
  represented in formulae, so we represent the information
  retrieved as a formula.<br>
  Not all information on the web is, of course, in N3. However the
  architecture we design is that N3 should here be the interlingua.
  Therefore, from the point of view of this system, the semantics
  of a document is exactly what can be expressed in N3, no more and
  no less.

  <h3>log:semantics</h3>

  <p>c log:semantics F &nbsp;is true iff c is a document whose
  logical semantics expressed in N3 is the formula F.</p>

  <p>This is the relation between a document and the logical expression
  which represents its meaning expressed as N3.&nbsp;&nbsp;The
  Architecture of the World Wide Web [AWWW] defines algorithms by
  which a machine can determine representations of a
  document given its symbol (URI).&nbsp;&nbsp;For
  a representation in N3, this is the formula which corresponds to
  the <span style="font-style: italic;">document</span> production
  of the grammar.&nbsp;&nbsp;For a representation in
  RDF/XML it is the formula which is the entire graph
  parsed.&nbsp;&nbsp;For any other languages, it may be calculated
  inasmuch as a specification exists which defines the
  equivalent N3 semantics for files in that language.</p>

  <p>On the meaning of N3 formula</p>

  <p>This is not of course the&nbsp; semantics of the document in
  any absolute sense.&nbsp;&nbsp;It is the semantics expressed in
  N3.&nbsp;&nbsp;In turn, the full semantics of an N3 formula are
  grounded,&nbsp; in the definitions of the properties and classes
  used by&nbsp;the formula.&nbsp;&nbsp;In the HTTP space in which
  URIs are minted by an authority, definitive information about
  those definitions may be found by dereferencing the URIs. This
  information may be in natural language, in some
  machine-processable logic, or a mixture.&nbsp;&nbsp; Two patterns
  are important for the semantic web.&nbsp;</p>

  <p>One is the grounding of properties and classes by defining
  them in natural language.&nbsp;&nbsp;Natural language, of course,
  is not capable of giving an absolute meaning to anything in
  theory, but in practice a well written document, carefully
  written by a group of people achieves a precision of definition
  which is quite sufficient for the community to be able to
  exchange data using the terms concerned.&nbsp;&nbsp;The other
  pattern is the raft-like definition of terms in terms of related
  neighboring ontologies.</p>

  <p>&nbsp; @@@@ A full discussion of the grounding of meaning in a
  web of such definitions is beyond the scope of this
  article.&nbsp;&nbsp;Here we define only the operational semantics
  of a system using N3.</p>

  <p>@@@@ &nbsp;Edited up to here</p>The log:semantics of an N3
  document is the formula achieved by parsing a representation of the
  document.<br>
  (Cwm note: Cwm knows how to go get a document and parse N3 and
  RDF/XML , in order to evaluate this. )<br>
  <br>
  Other languages for web documents  may be defined whose N3
  semantics are therefore also calculable, and so they could be
  added in due course.<br>
  See for example [GRDDL], [RDF/A], etc<br>

  <p>However, for the purpose of the analysis of the language, it
  is convenient to consider the semantic web simply as a
  binary 1:1 relation between a subset of&nbsp;symbols and
  formulae.</p>

  <p>For a document in Notation3, log:semantics is the<br>
  log:parsedAsN3 of the log:contents of the document.<br>
  <br></p>
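  <p>As a sketch of how a rule might reach into a document on the web,
  the rule below obtains the formula for a document and looks inside
  it.  The document URI and the ex: terms are hypothetical, chosen only
  for illustration.</p>
  <pre>@prefix log: &lt;http://www.w3.org/2000/10/swap/log#&gt;.
@prefix ex: &lt;http://example.org/terms#&gt;.

{ &lt;http://example.org/weather.n3&gt; log:semantics ?f.
  ?f log:includes { ex:Boston ex:weather ?w }
} =&gt; { ex:Boston ex:reportedWeather ?w }.
</pre>

  <p>If the document, when parsed, contained ex:Boston ex:weather
  ex:sunny, the rule would conclude ex:Boston ex:reportedWeather
  ex:sunny.  The conclusion uses a different property so that what the
  document reports is kept distinct from what the reasoner itself
  asserts about the weather.</p>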

  <h3>log:says</h3>log:says is defined by:<br>
  <br>
  F  log:says  G   iff  ∃  H  .
    <span style="font-family: monospace;">F log:semantics
  &nbsp;H</span>   and   <span style="font-family: monospace;">H log:includes G</span> 
   <br>
  <br>
  In other words, loosely a document says something if a
  representation of it in the sense of the Architecture of the
  World Wide Web [AWWW] N3-entails it.<br>
  <br>
  The semantics of log:says are similar to those of says in
  [PCA].<br>
  <br>

  <h2>Miscellaneous</h2>

  <h3>log:Truth</h3>

  <p>This is a class of true formulae.&nbsp;</p>

  <p>From &nbsp; { F rdf:type log:Truth } &nbsp; &nbsp;follows
  &nbsp;F&nbsp; &nbsp;</p>

  <p>The cwm engine will process rules in the (indirectly
  command-line specified) formula or in any formula which that
  formula declares to be a Truth.&nbsp;</p>

  <p>The dereifier will output any described formulae which are
  described as being in the class Truth.&nbsp;</p>This class is not
  at all central to the logic.

  <h2>Working with OWL</h2>

  <p>@@ Summary</p>

  <ul>
    <li>owl:sameAs considered the same as N3 value equality for
    data values.&nbsp;&nbsp;Axioms of
    equality.&nbsp;&nbsp;log:equalTo and
    log:notEqualTo&nbsp;&nbsp;compared with owl:sameAs. Compare
    math and string equality, and SPARQL equality.</li>

    <li>Operating in equality-aware mode.</li>

    <li>No attempt at connecting&nbsp;OWL DL language with the N3
    logic.&nbsp;</li>

    <li>Use of functional properties of a datatype conflicting with
    OWL DL.</li>
  </ul>

  <h2>Conclusion</h2>

  <p>The semantics of N3 have been defined, as have some built-in
  operator properties which add logical inference using rules to
  the language, and allow rules to define inference which can be
  drawn from specific web documents on the web, as a function of
  other information about those documents.</p>

  <p>The language has been found to have some useful practical
  properties.&nbsp;&nbsp;The separation between the Notation3
  extensions to RDF and the logic properties has allowed N3 by
  itself to be used in many other applications directly, and to be
  used with other properties to provide other functionality such as
  the expression of patches (updates) [Diff].</p>

  <p>The use of log:notIncludes to allow default reasoning without
  non-monotonic behavior achieves a design goal for distributed
  rule systems.</p><br>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 15 Jan 2013 00:00:00 GMT</pubDate>
  <title>The Many Meanings of Open</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Open.html</link>
    <guid>https://www.w3.org/DesignIssues/Open.html</guid>
      <description><![CDATA[
    <h1>
      The Many Meanings of Open
    </h1>
    
    <p>
    
    I was recently asked to talk about the idea of "open", and I realized the term is used in at least eight different ways.
    The distinct interpretations are all important in different but interlocking ways.
    Getting them confused leads to a lot of misunderstanding, so it’s good to review them all.
</p>
<p>
When we tease apart their meanings, we can understand more clearly which aspects of each are the most important. The first, one of the most important forms of openness for the Web, is its universality.
</p>
<h2>
Universality
</h2>
<p>

When I designed the Web protocols, I had already seen many networked information systems fail because they made some assumptions about the users – that they were using a particular type of computer for instance – or constrained the way they worked, such as forcing them to organize their data in a particular way, or to use a particular data format. The Web had to avoid these issues.
The goal was that anyone should be able to publish anything on the Web and so it had to be universal in that it was independent of all these technical constraints, as well as language, character sets, and culture.
</p>
<p>

Close to the principle of universality is that of decentralization, which means that no permission is needed from a central authority to post anything on the Web, there is no central controlling node, and so no single point of failure.  This has also been critical to the Web’s growth and is critical to its future.
</p>
<h2>

Open Standards
</h2>
<p>

The actual design of the Web involved the creation of open standards – and getting people to agree to use them globally. The World Wide Web Consortium (W3C), of which I am the Director, helps create interoperable standards for Web technology, including HTML5, mobile Web, graphics, the Semantic Web of linked data, and Web accessibility. Any company can join and anyone can review and help create the specifications for the Web.
</p>
<p>
The W3C process emphasizes transparency, openness, and consensus. In practice, fairness, technical quality and speed of process are always a trade-off to balance.
</p>
<p>

Other bodies work on other layers of the design: the IEEE for physical internet connectivity and the IETF for internet interoperability, for instance. Organizations like the IETF, IEEE and W3C support ‘OpenStand’, which encourages the development of market-driven standards that are non-national, free to access, open to participation, and (for W3C) free of royalty payments.
</p>
<h2>
Open Web Platform (OWP)
</h2>
<p>

W3C’s Open Web Platform is the name for a particular set of open standards which enable an exciting stage of Web computing. Standards such as HTML5, SVG, CSS, video, JavaScript, and others are advancing together so that programmes that once worked only on desktop, tablets or phones can now work from within the browser itself. It has all the power of HTML5, like easily-inserted video and, in the future, easily-inserted conferences. It also features the APIs for accessing hardware and other capabilities on the device, such as a smartphone’s accelerometer, camera, and local storage. While native apps are limited, Web Apps can work on any platform.
</p>
<p>

With Web Apps, every Web page can become a programmable computer, whether it is on a mobile device, a desktop, a TV, or, in the future, a car console. Native apps are not on the Web; they’re not part of the Web. Native apps that work on a single platform or even a single device are thus more limited than a Web App. I therefore encourage everyone to build Web Apps.
</p>
<h2>
Open Government through Open Data
</h2>
<p>

In 2009, I resolved to encourage more use of data on the Web. Too many websites could generate nice reports as documents, but offered no way to access the data behind them to check and build on the results.  In February that year I stood up in front of a TED audience and asked them for their data; I even got them to chant: “raw data now”.  In April that year, I met with Gordon Brown, then Prime Minister of the UK, and with him began the UK Government’s ground-breaking work on Open Data. That same year President Barack Obama announced his commitment to the US Open Government Initiative. In 2010 I went back to TED and showed the audience some of what had been achieved, including Open Street Map’s role in relief efforts in Haiti.
In 2012 we launched the UK’s Open Data Institute (ODI).
 ODI is a non-profit institute that incubates startups and promotes open-data businesses
  in East London’s Tech City. ODI was created to take advantage of and to help guide the wave of open data adoption by business that is happening now.  It is an exciting time for open data, but there is a huge amount more to do.
</p>
<h2>Update 2024: Open Science</h2>
<p>We have just (May 2024) celebrated 70 years of the European Particle Physics Lab, CERN, in Geneva. 
  A lot of the talks were about the importance of Open Science.
</p>
<h2>
Openness with personal data on the Social Net
</h2>
<p>

The word “open” is often used in the sense “I wouldn’t be that open with my personal life”. We are as a society learning how to draw the right boundaries in this new age. I won’t go into this in detail, but connected issues include the extent to which a social networking site which helps people share information also benefits from the data in completely different unforeseen ways, and ideas about what different sorts of data about people should be used for anyway. These issues may lead to new cultural norms as well as possibly new technical architectures.
</p>
<h2>
Open Platform
</h2>
<p>

While it’s not really a feature of the Web, a concern for a lot of people is whether they can choose which apps run on their own phone or computer. An Open Platform means having the right to install and write software on your computer or device. One motivation to close off a computing platform comes from a manufacturer wanting to allow you to experience their content on your machine without being able to store it or pass it on. Some systems are very closed, in that the user can only watch a movie or play a game, with no chance to copy anything or back it up. Some systems are very open, allowing users to take copies of files and run any application they like. Many systems fall in between, letting users pay for additional material or an experience.
</p>
<p>
The W3C community is currently exploring Web technology that will strike a balance between the rights of creators and the rights of consumers. In this space in particular, W3C seeks to lower the overall proprietary footprint and increase overall interoperability, currently lacking in this area.
</p>
<p>

In the US particularly, the situation is aggravated by the Digital Millennium Copyright Act (DMCA) and the Computer Fraud and Abuse Act (CFAA), two laws which allow a person who uses a computer improperly to be jailed as a felon for a long time. These unjust laws colour the debate so much in the US that some people react by saying that all platforms should be completely open, so that no one can be said to use them improperly. Hopefully these laws will be fixed through debate about how to balance the needs of creative people to be paid and the needs of consumers to be able to contribute but also to be able to rip, mix, quote and archive this material.
</p>
<h2>
Open Source
</h2>
<p>

“Open Source” is another way “open” is used on the web, one which has been and is very important to the Web’s growth. It’s important to me that I can get at the source code of any software I’m using. If I can get at the source code, can I modify it? Can I distribute the modified code and run it on my machine?  As Free Software Foundation lead Richard Stallman puts it, “free as in freedom rather than free as in beer”.
</p>
<h2>
Open Access
</h2>
<p>

Open Access is a Web-based movement specifically about free (as in beer) access to the body of academic learning. Governments, and therefore taxpayers, pay for research via grants but often the results of the research are kept in closed-access academic journals. The results are only available to those at big universities. The poor and those in remote rural areas cannot participate.
Open Access journals are academic journals legally and technically available openly on the Web at zero cost.
</p>
<p>

These have to be funded either by publication fees, which in turn have to be agreed to by research funders, or through the implementation of a very low cost Web-based system. Nowadays, governments (with the US NIH taking a lead), and the European Commission are starting to require open access to the results of taxpayer-funded research.
</p>
<h2>
Open Internet and Net Neutrality
</h2>
<p>

When we talk about keeping the internet free and open, we are often worried about blocking and spying. One of the ways in which we protect the Web is by ensuring Net Neutrality. Net Neutrality is about non-discrimination. Its principle is that if I pay to connect to the Net with a certain quality of service, and you pay to connect with that or a greater quality of service, then we can both communicate at the same level. This is important because it allows an open, fair market. It’s essential to an open, fair democracy. The alternative is a Web in which governments or large companies, or frequently a close association of the two, try to control the internet, with packets of information delivered in a way that discriminates for commercial or political reasons. Regimes of every sort spy on their citizens, deriving hugely accurate and detailed profiles of them and their intimate lives. Today, the battle is building.  The rights of individual people on the Web are being attacked, and at the moment only a
 few people really understand and realize what is going on.
 </p>
<p>

The World Wide Web turns 25 next year. We have come a long way, but we must all continue to push for these various forms of openness in the appropriate places. Only then can we ensure that the Web is for everyone.
</p>

<h2>2024-09-28 Update: Open Process</h2>
<p>
Many of the projects I've been involved in, or depended on, are developed with an open process.
The W3C and the IETF are open processes where there are non-profit organizations to run them.
Many open source projects are based on open processes around a shared code repository on something like GitHub or GitLab.

Hamish Campbell includes them in his essay on 4 opens[HC]: 

</p>

<blockquote>Open process refers to the transparent and participatory decision-making
  processes that govern [tech] projects. By involving stakeholders in project planning,
  development, and governance, open process fosters trust, accountability, and collective ownership.
  Progressive social and tech initiatives can embrace open process by adopting democratic
   and inclusive decision-making structures, such as consensus-based decision-making or participatory budgeting.
    For example, open process can enable community-led initiatives, 
    address social justice issues, and promote collective well-being.
</blockquote>
    <p>He mentions tech projects, but there is no reason it has to be a tech project. It could be any project.

    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 01 Feb 1999 00:00:00 GMT</pubDate>
  <title>&quot;Paper Trail&quot; -- build social and commercial systems out of linked immutable documents</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/PaperTrail.html</link>
    <guid>https://www.w3.org/DesignIssues/PaperTrail.html</guid>
      <description><![CDATA[
    <h1>
      Paper Trail
    </h1>
    <p>
      Here we look at the relationship between documents (living or
      dead but basically bits of state) and messages (events with
      associated data, including typically but not essentially
      sender and recipient).
    </p>
    <p>
      Here is a proposal for a project: "Paper trail" state machine
      for workflow. The concept here is that the state of any
      transaction is in the real world (and in this formalization
      in the Web) just a function of all the messages which form part
      of a protocol.
    </p>
    <blockquote>
      <h3>
        Epilogue (2001/05)
      </h3>
      <p>
        The <a href="/2001/01/WSWS">Web Services workshop</a>, in
        discussing transactions over the Net, surfaced the need for
        process flow descriptions.
      </p>
      <h3>
        Update (2004/03)
      </h3>
      <p>
        The <a href="/2000/10/swap/">Semantic Web Application
        Platform (SWAP)</a> now has enough functionality to
        implement these ideas. See <a href="/2000/10/swap/ppt-bank/">ppt-bank</a>, especially <a href="/2000/10/swap/ppt-bank/checking.n3">checking.n3</a>.
      </p>
    </blockquote>
    <h2>
      Introduction
    </h2>
    <p>
      Social processes look like state machines. However, they
      don't exist as a state variable stored in one place, but as a
      trail of documents. You know the true state of the machine
      only if you have access to the latest documents. (This is not
      the problem addressed here, this is real life being
      modelled.) <em>Paper-trail</em> is a system which allows one
      to follow a strict process by creating new documents in a
      constrained fashion. Every paper-trail document has a pointer
      to a "paper-trail schema" which defines its document type (eg
      "constitutional amendment"), a pointer to its justification
      documents, and (maybe) a notarization of when it was checked
      against the schema by the paper-trail program. The schema
      defines:
    </p>
    <ul>
      <li>Prerequisites for a document being valid, in terms of
      other documents
      </li>
      <li>Hints to other document types you can make from this one
      (state transitions)
      </li>
    </ul>
    <h3>
      Example
    </h3>
    <blockquote>
      <p>
        To make a new W3C working draft, the schema requires
        pointers to the old working draft, the new document, and the
        editor's authorization. The editor must be defined as editor on
        the home page of the working group, where the working group page
        is pointed to by the old draft. If all those exist, then the new
        document is created from all that and notarized (time
        stamped) by the software. The human readable part of the
        document is created as a (simple macro) function of the
        input documents. A document also has buttons to take you
        to a form to turn it into another type of document
        according to hints in the schema.
      </p>
    </blockquote>
    <h3>
      Example
    </h3>
    <blockquote>
      <p>
        A button on a Working Draft takes you to a form for
        promoting it to a "proposed recommendation". This requires
        different things (all the above plus endorsement of new
        draft by director or any two members of the management
        group.)
      </p>
    </blockquote>
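    <p>
      The two examples above can be sketched in a few lines of
      Python. This is only an illustration under stated
      assumptions: the names <code>Document</code>,
      <code>Schema</code>, <code>missing</code> and
      <code>is_valid</code> are hypothetical, the document types
      are plain strings, and notarization and the check that the
      editor is listed on the working group page are left out.
    </p>
    <pre><code class="language-python">from dataclasses import dataclass, field


@dataclass
class Document:
    """A paper-trail document: a type plus pointers to its justifications."""
    doc_type: str
    justifications: list = field(default_factory=list)


@dataclass
class Schema:
    """Hypothetical schema: prerequisites plus hints for state transitions."""
    doc_type: str
    prerequisites: set   # document types that must be among the justifications
    hints: list          # document types that can be made from this one

    def missing(self, doc):
        """Return the prerequisite types that doc does not point to."""
        present = {j.doc_type for j in doc.justifications}
        return self.prerequisites - present

    def is_valid(self, doc):
        """A document is valid if its type matches and nothing is missing."""
        return doc.doc_type == self.doc_type and not self.missing(doc)


working_draft = Schema(
    doc_type="working draft",
    prerequisites={"working draft", "new document", "editor authorization"},
    hints=["working draft", "proposed recommendation"],
)

old_draft = Document("working draft")
text = Document("new document")
authorization = Document("editor authorization")

new_draft = Document("working draft", [old_draft, text, authorization])
unauthorized = Document("working draft", [old_draft, text])

print(working_draft.is_valid(new_draft))      # True
print(working_draft.missing(unauthorized))    # {'editor authorization'}
</code></pre>
    <p>
      The <code>hints</code> list is what a user interface would
      read to decide which buttons to show on a document.
    </p>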
    <h2>
      Technology
    </h2>
    <p>
      If you are considering this as a student project, consider
      these directions:
    </p>
    <ul>
      <li>Use RDF within the document to express its state.
      </li>
      <li>Develop declarative language for defining the
      prerequisites - ideally in RDF too.
      </li>
      <li>Develop GUI for creating a new document by supplying the
      prerequisites
      </li>
      <li>Allow hooks for digital signature but don't have to
      implement it
      </li>
    </ul>
    <h2 id="Generalizi">
      Generalizing for formal protocols
    </h2>
    <p>
      The concept of a paper trail is common in conventional
      administration, but the model can also be applied to
      well-defined computer protocols.
    </p>
    <h2 id="Model">
      Model
    </h2>
    <p>
      The model is that a protocol P defines a state s<sub>n</sub>
      as a function of a message m and a previous state
      s<sub>n-1</sub>, and the time t.
    </p>
    <p>
      s<sub>n</sub>= P(m<sub>n</sub>, s<sub>n-1</sub>, t)
    </p>
    <p>
      or for that matter as a function of all the messages to date
    </p>
    <p>
      s<sub>n</sub>= P'({m<sub>i</sub>}<sub>i=1..n</sub>)
    </p>
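    <p>
      The relationship between the two forms can be sketched in
      Python: P' is what you get by folding P over the trail of
      messages, starting from an initial state. The toy protocol
      here (a running total), the names <code>P</code>,
      <code>P_prime</code> and <code>S0</code>, and the omission
      of the time t are all assumptions made for illustration.
    </p>
    <pre><code class="language-python">from functools import reduce

S0 = 0  # generic initial state (assumption: the state is just a number)


def P(message, state):
    """Toy protocol step: each message carries an amount to add to the state."""
    return state + message["amount"]


def P_prime(messages):
    """State as a function of all the messages to date, starting from S0."""
    return reduce(lambda state, message: P(message, state), messages, S0)


trail = [{"amount": 5}, {"amount": -2}, {"amount": 10}]

# Stepping through the trail one message at a time...
state = S0
for m in trail:
    state = P(m, state)

# ...gives the same state as replaying the whole trail.
print(state, P_prime(trail))  # 13 13
</code></pre>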
    <p>
      The state could be a logical formula, an RDF graph, or an XML
      document, or just a number, in decreasing order of interest.
      The system can be any one of a number of types of machine,
      including the well-known finite state machine and push-down
      automata.
    </p>
    <p>
      In an XML world, think of the state and the messages all
      being expressed in XML, and the protocol maybe being an XSLT
      script.
    </p>
    <p>
      The state must record everything necessary for calculating
      future states for any new message. It could also record the
      results of the protocol. For example, the state of TCP (where
      IP packets are the {m} ) must hold the state of the packets
      unacknowledged in the sliding window, but when the connection
      has been successfully closed it could hold either just
      "terminal state", or also the ordered set of bytes
      transferred in the connection.
    </p>
    <p>
      The protocol function can be seen as an information
      destroying function. By specifying what needs to be
      remembered, it defines what can be thrown away. This is of
      course very important. Of course, one might in some cases
      still want to spool the messages for security, but the actual
      information needed to describe the state of affairs is
      limited.
    </p>
    <p>
      Typically, to be valid, messages will link back to previous
      messages either directly or through common threading
      identifiers of some sort. A message without such a reference
      will in most cases not have any effect on the state.
    </p>
    <p>
      There will in general be error states, which the protocol
      does not allow, which any message which is invalid in some
      way will lead to. Functionally there need only be one error
      state but in practice one might want to preserve the state
      before the error and details of the error. Some protocols
      model most errors themselves by sending.
    </p>
    <p>
      There must obviously be a set M<sub>0</sub> of valid ways to
      start a protocol in the first case from the generic initial
      state s<sub>0</sub>. For example, in TCP one sends a SYN
      message; on the telephone one picks up the receiver. For any
      m in M<sub>0</sub>, P(m, s<sub>0</sub>) will be a valid
      (non-error) state.
    </p>
    <p>
      There will in some systems be a set F of final states, in
      which no further messages can have any effect on the state.
      For any s in F, P(m,s) = s for all m.
    </p>
    <p>
      For example, in the US, when 7 years have passed since a
      transaction occurred, then all records may be discarded as no
      one, even the tax man, has the right to query them. The state
      is reduced to a minimum. Most systems can be modelled in a
      simple or complex way, the simple way ignoring a lot of the
      auditing processes for example. A simple model of a loan
      between two people has a state which is the balance amount
      and one final state when that is zero. Other systems are
      designed to remain in non-final state: a lifetime warranty is
      a protocol which remains in non-final state (until you die!),
      waiting for any message that you are dissatisfied with the
      product.
    </p>
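    <p>
      The simple loan model can be written down directly. In this
      Python sketch the state is the outstanding balance, zero is
      the one final state, and any invalid message leads to a
      single error state. The message shape
      <code>("repay", amount)</code>, the names
      <code>loan</code>, <code>ERROR</code> and
      <code>SETTLED</code>, and the choice to make the error
      state absorbing are assumptions made for illustration.
    </p>
    <pre><code class="language-python">ERROR = "error"   # single error state, as in the functional model above
SETTLED = 0       # the one final state: nothing is owed


def loan(message, balance):
    """Hypothetical loan protocol: the state is the outstanding balance."""
    if balance == SETTLED or balance == ERROR:
        return balance                    # P(m, s) = s: no message has any effect
    kind, amount = message
    if kind != "repay" or amount > balance or not amount > 0:
        return ERROR                      # an invalid message leads to the error state
    return balance - amount


state = 100
for m in [("repay", 60), ("repay", 40), ("repay", 5)]:
    state = loan(m, state)
print(state)  # 0
</code></pre>
    <p>
      The last repayment arrives after the balance has reached
      zero, so it has no effect on the state.
    </p>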
    <p>
      Real systems are part of bigger systems, and so the real
      protocol will function as part of a larger protocol. For
      example, a working group at W3C goes through many internal
      state changes, and (on a simple model) the last is when their
      work is accepted by the Consortium as a whole as a
      Recommendation. This is a message leaving the system, which
      forms part of the larger protocol. Modeling this is clearly
      interesting. (To demonstrate this nesting by an example of it
      breaking, think of the case of a working group not arriving
      at consensus and passing on not only a final document but
      also a minority report, basically a peek into the internal
      workings of the group which did not in fact arrive in its
      final state. ) This would include modelling tasks which can
      split, and be recursively delegated, and so on.
    </p>
    <h2>
      Cool things
    </h2>
    <p>
      This system can allow well-defined social processes to work
      eg on a net newsgroup, or by email. ie, it works in a
      write-only medium.
    </p>
    <p>
      It models real life in commerce well, where the state really
      is an abstract thing and one's perception of it depends on
      the set of messages one has had access to.
    </p>
    <p>
      Hopefully we can use this model to define systems which are
      even more powerfully distributed than any we use at the
      moment.
    </p>
    <h2 id="Linking">
      Linking Remote operations and Data Formats
    </h2>
    <p>
      I must have discussed the relationships between remote
      operations and data formats before. Maybe I have made a table
      with schema languages compared against interface definition
      languages, and so on.
    </p>
    <p>
      Now we have a clear way of expressing the relationship
      between the two. A Protocol definition document defines a
      document as a function of messages, which can be represented
      as documents - so we can look at remote operations in terms
      of documents. Typically RPC messages are very constrained:
      this model allows much more complicated multi-party protocols
      to be defined.
    </p>
    <h2>
      Challenges if you finish early
    </h2>
    <p>
      If making a paper trail machine was fun, here are some more
      ideas.
    </p>
    <ul>
      <li>Add time-aware social processes such as promises and
      timeouts.
      </li>
      <li>Do you need to be able to prove non-existence of
      documents?
      </li>
      <li>Locally to an author or globally?
      </li>
      <li>States can split. (draft can go to W3C or IETF process or
      both).
      </li>
      <li>How can you limit this, when socially undesirable?
      </li>
      <li>Develop proofs that processes will achieve given ends.
      </li>
      <li>Model processes near you:
        <ul>
          <li>auction
          </li>
          <li>peer review journal
          </li>
          <li>presidential impeachment ;-)
          </li>
          <li>internet newsgroup creation
          </li>
          <li>formation of a company
          </li>
          <li>MIT purchasing (possible PhD thesis ;-)
          </li>
        </ul>
      </li>
      <li>Develop theories in which players are
        <ul>
          <li>collaborative
          </li>
          <li>competitive
          </li>
          <li>allowed to create new schemas to achieve their ends
          </li>
        </ul>
      </li>
      <li>Model existing systems near you:
        <ul>
          <li>TCP
          </li>
          <li>HTTP...
          </li>
        </ul>
      </li>
      <li>Develop a protocol machine, which, acting on behalf of
      one agent, will determine when that agent has a possible move
      to make, and when in fact the protocol is acting for that
      agent. Develop a GUI which helps a human user choose from the
      set of possible options at that state of the protocol.
      </li>
    </ul>
    <h2 id="Products">
      Products
    </h2>
    <p>
      The thing which would come out of this idea would I imagine
      be a standard language for writing protocols. Of course, it
      would mainly be something else, such as an rdf-logic
      language, or prolog or whatever, but there would have to be
      hooks to define it to be a definition of a protocol.
    </p>
    <p>
      This takes the self-describing web concept into a new area:
      that messages are self-describing in that they contain a
      pointer to the language in which they are written, and that
      includes (or points to) the protocol to which they claim to
      adhere.
    </p>
    <p>
      @@ Add pointers to work done with Notation3
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 01 Oct 2000 00:00:00 GMT</pubDate>
  <title>Persistent Domains</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/PersistentDomains.html</link>
    <guid>https://www.w3.org/DesignIssues/PersistentDomains.html</guid>
      <description><![CDATA[
    <h1>
      Persistent Domains
    </h1>
    <p>
      This is a proposal to address the problems with the
      persistence of HTTP URIs. It introduces the concept of a
      datestamped domain name with associated rights and
      obligations of ownership.
    </p>
    <h2>
      Introduction
    </h2>
    <p>
      The lack of persistence of URIs leads to
      many well-known problems, including
    </p>
    <ul>
      <li>user frustration and social dysfunction&nbsp;with Error
      404 messages; and
      </li>
      <li>(worse) the dereferencing of a URI using HTTP/DNS leading
      to a completely different resource from that intended by the
      referring person.
      </li>
    </ul>
    <p>
      &nbsp;A&nbsp;second-order symptom of the problem is that
      there is a constant stream of proposals for new URI schemes
      with different, incompatible, name-lookup technology. These
      are often made with less attention to the real social issues
      surrounding persistence than HTTP/DNS, but propose to be
      systems otherwise similar except in having greater
      persistence.
    </p>
    <h2>
      The analysis
    </h2>
    <p>
      The persistence of HTTP URIs can be factored into two issues:
    </p>
    <ol>
      <li>The persistence of the opaque string which follows
      the&nbsp;domain name, and
      </li>
      <li>the persistence of the domain name&nbsp; itself.
      </li>
    </ol>
    <p>
      The first of these&nbsp;issues is of course under the
      control of the domain owner.&nbsp; This, combined with a
      dearth of tools which help one run a web server with
      persistent URIs, has led to a vastly varying level of
      persistence.&nbsp; Some sites understand and construct their
      URIs carefully, while some obviously have not thought about
      the problem and end up changing URIs out of thoughtlessness
      rather than malice. I have summarized some aspects of this
      problem in "<a href="http://www.w3.org/Provider/Style/URI"><em>Cool URIs Don't
      Change</em></a>" .&nbsp; This relies on publishers making an
      institutional commitment to persistence.&nbsp; I have tried
      to lead the way with W3C's draft <a href="http://www.w3.org/Consortium/Persistence">Persistence
      Policy</a>. However, this only addresses the first element.
    </p>
    <p>
      The second&nbsp;issue is summarized by ICANN chair Esther
      Dyson's line, <em>You don't buy domain names, you rent
      them</em>. This single&nbsp;meme instantly undermines any
      public expectation that domain names should be persistent. In
      principle, if&nbsp; <a href="http://www.example.com/downloads">http://www.example.com/downloads</a>
      today points to example.com's download page, tomorrow, it
      could point to anything as defined by a company which has
      bought, or acquired through lawsuit or accidental expiry, the
      domain name "example.com".&nbsp;&nbsp;&nbsp;In fact, however,
      huge numbers of links are being made with HTTP. The planet's
      investment in domain names in references is huge. The
      technology&nbsp;which actually is involved
      in&nbsp;dereferencing such identifiers has become more and
      more&nbsp;sophisticated.&nbsp; In practice, also,
      the&nbsp;legal weight&nbsp; behind a&nbsp;significant
      organization's ownership of a domain name is
      considerable.&nbsp; No one would dream that a legal battle over
      <em>microsoft.com</em> or <em>ietf.org</em> would be lost
      to&nbsp;some&nbsp;sneaky entrepreneur.&nbsp; Though the rift
      between trademark law and domain names is a problem, there in
      fact is strong legal support for an organization's ability to
      keep its name. But is this enough?
    </p>
    <h2>
      The Solution
    </h2>
    <p>
      I think we can do better. We can do better, if this scheme
      works out, on both issues.&nbsp; To tackle the second, we can
      create domain names which are allocated once and once only,
      which are bought, not rented. We can simultaneously set
      expectations that such data will endure, that names will not
      be reused, and that information will be available even after
      organizations involved have disappeared.&nbsp; The trick is
      the same one as used in W3C's datespace URIs (such as
      <a href="http://www.w3.org/2000/01/sw">http://www.w3.org/2000/01/sw</a>)
      and British car registration plates: to date code them. For
      example, let us create a series of top-level
      <strong>persistent domains</strong> y2000, y2001 and so on.
      One would only be able to acquire a .y2000 domain name&nbsp;
      during the year 2000. Once acquired one would have it
      forever. In fact, the domain name would correspond forever to
      the information published in it.&nbsp;
    </p>
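    <p>
      The date-coding rule is simple enough to check
      mechanically. The Python sketch below tests only the rule
      that a name in .y2000 can be acquired during the year 2000
      and at no other time. The function name
      <code>can_register</code> and the pattern for the top-level
      label are hypothetical, and a real registry would also have
      to check that the name had never been allocated before.
    </p>
    <pre><code class="language-python">import re
from datetime import date

# Hypothetical pattern for a date-coded top-level domain such as "y2000".
PERSISTENT_TLD = re.compile(r"^y([1-9][0-9]*)$")


def can_register(domain, today):
    """True if `domain` is in a persistent domain open for allocation today."""
    tld = domain.rsplit(".", 1)[-1]
    match = PERSISTENT_TLD.match(tld)
    return match is not None and int(match.group(1)) == today.year


print(can_register("example.y2000", date(2000, 10, 1)))   # True
print(can_register("example.y2000", date(2001, 1, 1)))    # False
print(can_register("example.com", date(2000, 10, 1)))     # False
</code></pre>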
    <p>
      We could take the precaution of inserting a few rules for
      sanity here.&nbsp; There should be some combination of due
      diligence to ensure you are not infringing trademark when
      applying for an alphanumeric name. I would like to put a
      limit on the number of persistent domains any company could
      own - or at least put those who have n in the queue ahead of
      those who have k&lt;n.&nbsp; There could be a convention that
      if you are happy with a random numeric domain 6872364.y2000
      you can have one immediately and automatically.&nbsp; It
      would be great to put some constraint on sitting on a domain
      without using it, and maybe on transfer of domain ownership.
    </p>
    <p>
      To tackle the first issue, an organization wishing to enter
      the scheme makes necessarily a few commitments. One is that
      it must partake in some cooperating mirroring scheme in which
      other organizations or commercial services take on running
      mirrors of the site.&nbsp; I can imagine this working, for
      example, as a for-pay service for consumers, or a mirror-ring
      system for academic institutes. There must be some form of
      contract which, in the event of the original publisher of the
      information coming to a voluntary or involuntary demise, the
      mirror sites will continue the service.&nbsp; The documents
      then enter an <strong>archive state</strong> in which the
      original publisher loses authority to evolve the domain and
      the public gain rights of access. The actual contractual
      arrangements will in fact have to be skillfully set up. For
      example, there will be some information which will be just so
      uninteresting that no one will be prepared to pay for it in
      the long run, but there must be some way to give serious
      archival institutions (the major national libraries for
      example) the right to take over an archive for the public
      good.&nbsp; However, it would be best to start with simple
      conditions, but allow them to be modified with experience
      (real and&nbsp;imagined - <em>gedanken</em>) of the
      system.&nbsp; Other interesting things which come to
      mind&nbsp;include the mirroring of confidential access
      controlled documents with a 30 year timeout on the
      confidentiality.
    </p>
    <p>
      I don't know how best to enforce that a URI is never
      reallocated to something else by a publisher. Actually, I
      don't think it will be a long term problem, as the tools will
      be made such that it is not a function. ("Rename: command not
      known"!).&nbsp; Obviously&nbsp;re-use would cause a big mess,
      as caches across the globe would run out of sync, and
      assumptions made by mirrors and proxies would become
      invalid.&nbsp; Perhaps a suitable disposition for a domain
      whose publisher consistently flouts the rules would be for
      it to be more or less automatically declared dead, and for it
      to pass into archive state just as though the organization
      had passed away. The organization can then ask for a new
      domain and start again - at least the first time.
    </p>
    <p>
      A clarification of what re-using a URI means.&nbsp; As
      I&nbsp;have pointed out many times, most URIs are <a href="http://www.w3.org/DesignIssues/Generic">Generic URIs</a>.
      They refer to, not a fixed set of bits, but a conceptual
      resource whose representation may vary with (for example)
      time, technology, language, and so on.&nbsp; This is fine,
      even in a persistent domain. It is in my opinion important to
      distinguish (and ideally independently identify with separate
      URIs) generic resources and any specific representation in
      use in a specific case.&nbsp; The contractual obligations of
      ownership of a persistent domain could usefully include the
      provision of such metadata, and the allocation of a
      persistent URI to the actual (mime-type, octet-string)
      specific entity.
    </p>
    <p>
      Obligations of persistence&nbsp;would have to
      apply&nbsp;across the transfer of the persistent domain. The
      persistent domain would be property, with (like land) rights
      and obligations. Transfer of the domain name would carry
      within it both.&nbsp; Fleet Bank would not be able to buy
      BankBoston and simply drop support for documents
      bankboston.y2000 - without&nbsp;all public documents falling
      into archive state.
    </p>
    <p>
      Another clarification.&nbsp; A document in archive state has
      a certain general license to mirror and reproduce in
      unmutilated form, but the intellectual property rights remain
      as ever with the author. A very tempting rule (which I am not
      convinced of yet) is that once the original domain owner has
      defaulted on publishing the material, that it acquires some
      limited redistribution license<a href="#Print"><sup>*</sup></a>.&nbsp; This is the Web equivalent
      of an implicit right to photocopy any book out of print: the
      publisher has deigned not to make copies, and it impedes the
      operation of society to prevent this erstwhile public
      information from being accessible.
    </p>
    <h2>
      Summary&nbsp;
    </h2>
    <p>
      Here is a collection of simplified rules which would form the
      protocol of persistent domains.
    </p>
    <ol>
      <li>Domain names *.y2000 only allocated in 2000, and so on;
      </li>
      <li>Some level of trademark due diligence before registration
      for alpha numeric names
      </li>
      <li>Random numeric names&nbsp;&nbsp; 123678.y2000 issued
      immediately but not transferable
      </li>
      <li>A limit on the number of persistent domains per
      organization would be useful
      </li>
      <li>Domain names owned for life and persist forever
      </li>
      <li>URIs may be live or frozen but not reused arbitrarily
      </li>
      <li>Implicit irrevocable license for 3rd parties to mirror
      public info now and after death
      </li>
      <li>The registry(-ar - whatever is the controlling authority,
      I forget the terminology) would be a neutral non-profit
      cooperative.
      </li>
      <li>ICANN's delegation of .y[1-9]* would be irrevocable in
      all time.
      </li>
      <li>Anyone using it would pay me $1 (just kidding! :-)
      </li>
    </ol>
    <p>
      Provided these are put together with sufficient care, the
      system should run itself in such a way&nbsp;as to preserve
      our information world for posterity,&nbsp;be it represented
      by a consumer in&nbsp;a decade searching for instructions
      from the now-defunct manufacturer of an appliance, or by a
      historian&nbsp;in&nbsp;a millennium&nbsp;trying to figure out
      what on earth made us all tick way back then.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 16 May 2024 00:00:00 GMT</pubDate>
  <title>Link in Bio: Personal data which is Public</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/PersonalPublic.html</link>
    <guid>https://www.w3.org/DesignIssues/PersonalPublic.html</guid>
      <description><![CDATA[
  <p>
  So much has been written (and implemented in code) about open public data and personal private data,
  but there is an important case of people's personal public data which has almost been overlooked.
  This is not important for everyone, but for a lot of people, particularly professional people,
      the data the world knows about them is important to them and to their life.
      People may want to put anything they are proud of in (or linked from) their bio, including
      achievements, resumé, skills, and languages spoken. Also, for safety, medical information such as allergies, blood type, and next of kin.
      It is also useful to link to other sources of information, like other social network accounts.
</p>


<p><a href="https://www.w3.org/DesignIssues/PersonalPublic.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 29 Jul 2002 00:00:00 GMT</pubDate>
  <title>Philosophical Engineering and Ownership of URIs</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/PhilosophicalEngineering.html</link>
    <guid>https://www.w3.org/DesignIssues/PhilosophicalEngineering.html</guid>
      <description><![CDATA[

<h1>Philosophical Engineering and Ownership of URIs</h1>

<p class="q">Harry Halpin: How did the idea of philosophical engineering come about?</p>

<p>Tim Berners-Lee: The phrase came about when we were originally discussing the idea of Web Science, and I was tickled by the fact that when you study and take exams in physics at Oxford, formally the subject is actually not physics but experimental philosophy. I thought that was quite an interesting way of thinking about physics, a kind of philosophy that one does by “dropping things and seeing if they continue to drop” – in other words, “thinking about the stuff you do by dropping things.” Then it came up again when trying to explain to people that when we design Web protocols, we actually get a chance to define and create the way a new world works. It struck me what we ended up calling “Web Science” could have been called “philosophical engineering,” because effectively when you create a protocol you get the right to “play God” and define what words mean. You can define a philosophy, to define a new world. So when people use your system, when they run the protocol, to a certain extent they have to leave their previous philosophy at the door and they have to join in and agree they will work with your system. So you can build systems, worlds, which have different properties. That's exciting, and a source of responsibility as well.</p>

<p class="q">Harry Halpin: Would you consider the creation of Web standards to be an act of philosophy in progress?</p>

<p>Tim Berners-Lee: Certainly, when people write a specification, they argue about what words mean until everyone assumes that they mean in some sense the same thing. When the concepts in different people's brains have been sufficiently well-aligned and there have been enough connections between the concepts, this is written down in a language that people feel comfortable with and that they share. You can if you want to philosophically argue that a word is in fact ambiguous, but nobody bothers. Understand when you play the game [of specification-writing], you're not going to argue about that. For example, you're not going to pay a bill online, and then afterwards come back and say “Well, I sent some HTTP headers off, but because they're just HTTP headers, they don't actually mean anything.” As a spammer said once, “It's just a form field, I can put whatever I like there, it doesn't have to be the person sending the email.” But it does if you're playing the game! I think one of the things we're missing is the relationship between the law of the land and protocols. It should be easier to establish that when someone disobeys a protocol they've broken the law via a straightforward path.</p>

<p class="q">Harry Halpin: One of the most important aspects of natural language is that it's composed of words. In contrast, the Web is a space of URIs. How is it that URIs and their meaning differs from other possible systems like natural language? What is special about URIs?</p>

<p>Tim Berners-Lee: There are many URI schemes, but one thing that is nifty about HTTP URIs is that they have domain names in them. So they're hierarchical, and a domain is something that one can own. In the way the protocol works, the owner of the domain has the right to say, and the obligation to say on the Semantic Web, what the things in that domain mean. It's not a question of philosophical discussions between third parties. If there's a dispute about what a URI stands for, then the way the protocol works is that you go to the person who owns the domain name, who typically delegates it to someone else, who has in turn designed an ontology that they store on a Web server. The great thing about the Web is that you can look up the HTTP URI in real-time to get some machine-readable information about what it means straightaway.</p>

<p class="q">Alexandre Monnin: Regarding names and URIs, a URI is not precisely a philosophical concept, it's an artifact. So you can own a URI while you cannot own a philosophical name. The difference is entirely in this respect.</p>

<p>Tim Berners-Lee: For your definition of a philosophical name, you cannot own it. Maybe in your world, in your philosophy, you don't deal with names that are owned, but in the world we're talking about, names are owned. Some people have a philosophy where they find it useful to think of a name as just a function of use, not of definition. Other people work in worlds like lawyers, where the model is that there is a definition, a legal definition of a term. There's enough law to insist that while meaning is use, use is according to definition, because otherwise people could get put in jail. So there are models, and now we're adding another one, in which meaning is defined by the owner of a name.</p>

<p class="q">Harry Halpin: Wasn't it controversial, when the Web was first starting, that everything could be named with a URI?</p>

<p>Tim Berners-Lee: At the IETF certainly there was resistance. I originally called these things “Universal Document Identifiers” (UDIs) even before we started using them for concepts. The IETF were a bit put off, thinking it was too much hubris to call them “universal” but now I realize that I should have held firm and said “but they are,” as any alternative system of naming you can make out there, I can map it to the character set we use in URIs and I can invent a new scheme for it. So because we can map any scheme, we'd already mapped Gopher and FTP and these things. Now we've got HTTP and there will be lots of other schemes. So in a sense it is universal, we're saying anything, any name that you come across, can be mapped into this space. So yes, there was a lot of pushback against that, and hence the “uniform” rather than “universal” in URIs.</p>

<p class="q">Alexandre Monnin: Given the origins of philosophical engineering and Web Science, don't you think that Web Science is doing two things? The Web is an artifact, we produce it, we implement it, and as you said, we decide what the protocol means and how it should be used. On the other hand, Web Science is a science, so we make discoveries and we are also surprised by our own creation.</p>

<p>Tim Berners-Lee: The Web Science cycle starts off with the idea that the design of the Web is not just the design of one thing but the design of two things; for example in email, there's a general technological protocol like SMTP and there's a social protocol. In email, there's a social protocol that states that everyone involved is ready to run a machine that has the space to store all your friends' emails while they are en route to their destination, that people will send email to each other on perfectly reasonable topics, and that people will read email that they receive. There's a social piece of email, then e-mail is actually pushed around with SMTP and pulled off with IMAP, and those pieces then together form a system. It's a microscopic system that explains how one person sends another person an email through a finite number of hops, but then you get the effects of scale. So the engineering of Web Science is not like building a mousetrap. You design a microscopic system, but what you're interested in is the macroscopic phenomena that emerge. So when you do the science, the analysis, and the whole rest of the cycle for e-mail, you look at what is happening and notice “Spam has happened, oh dear!” What went wrong? One of our social assumptions was wrong, namely that everybody is friendly and will only send email to another person when the other person wants to read it. So the academic assumption is broken, and we have to redesign e-mail, and interestingly no-one has really succeeded in redesigning either the social or technical piece of mail to make spam go away. So there's an example: there's a design piece and an analysis piece, there's an engineering piece and a science piece, one being done on the microscopic system and the other being done on the macroscopic system, and we're missing a lot of the mathematics that let us understand the connection between the two levels.</p>

<p class="q">Harry Halpin: What is the role of philosophy in Web Science? Is there such a thing as a philosophy of the Web?</p>

<p>Tim Berners-Lee: An awful lot of philosophy in the past has been in a way wasted, as it was done before we understood evolution. We were trying to understand emotions, and now we can point to evolution producing mammals with emotions. A lot of philosophy in the past is inapplicable. A lot of people might say that philosophy is irrelevant to daily life, but if a Working Group stops and people start arguing about what things really mean, and people refuse to play the game, refuse to say what terms mean, and they don't do their job to define a protocol properly, then it's a philosophical task to point out to them that this is important. Also, philosophy may be necessary to explain when the legal system hits the Web. When you make a web-page you can link to anything, you can write anything about it, but when a lawyer comes along and reserves the right to charge you to link to their page, then in a way it's a philosophical question, because you have to tie it to the way the protocol is defined over a name, just a reference, something that has never been controlled over the millennia. Systems where you control names haven't worked so far, and so you need the philosophy to show how these protocols ground out in legal history, in concepts for using names lawyers understand.</p>

<p class="q">Alexandre Monnin: What do you expect from philosophy of the Web?</p>

<p>Tim Berners-Lee: What I would like for philosophers to do is to work diligently and to produce very nice, simple documents which describe to people like computer scientists how things work in a simple way. What happens when you click on a link? Quite a lot of that is philosophy. So, I'd like for you to have enough of a body of understanding, so when people in a Working Group stop and say “Wait, this doesn't match what I learned from Wittgenstein” that you can say “No, please go read this pamphlet, it's about philosophical engineering and it explains the philosophy of what you're doing, so you won't find Wittgenstein very useful in this case or these are the bits that you will find useful.” So if you can produce enough discussion and understanding so that we don't have to stop work for philosophical discussions and we can rely on it being there, that would be excellent.</p>





    ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 01 Jan 2022 00:00:00 GMT</pubDate>
  <title>Pod Stuff: What&apos;s in a Pod, and how is it indexed?</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/PodStuff.html</link>
    <guid>https://www.w3.org/DesignIssues/PodStuff.html</guid>
      <description><![CDATA[
  <p>
  What should be in your Solid Pod?  Anything, of course.
  Any data in your life, across the data spectrum from public to private
  and everything shared with communities in between.
  The activities in your life you currently do on your phone, on a web app,
  or on any device at all. And activities which currently you can't really do
  online but you will be able to in future.
  This data will vary hugely in size and shape, and immutability, and its needs for speed, and
  its needs for security, may be very different.
  But still it must be organized in a clear way, an extensible way, and
  a way which allows interoperability between different applications.
  To be extensible across these dimensions,
  we must give applications the power to configure this data in the user's pod in a way
  which meets these requirements, and also works well for that particular app.
  We expect different apps to often share common design patterns for similar types of data.
  We do this principally by initially keying data by its RDFS Class, not by the particular app which wrote it.
  But then different apps need to be able to create very specific structures including many different types.
  We expect the classes of data used by different apps to overlap a lot.
  We track and use its provenance carefully and carefully control its access, to be able to model the
  way we really trust data, people and bots in this very rich, varied, and valuable system.
  Keeping stuff related to the same activity together in a pod,
  even though it is of many different types, is valuable, because the same access and trust
  conditions apply and it is much simpler to get those right.
</p>


<p><a href="https://www.w3.org/DesignIssues/PodStuff.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 01 Oct 1998 00:00:00 GMT</pubDate>
  <title>Preface</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Preface.html</link>
    <guid>https://www.w3.org/DesignIssues/Preface.html</guid>
      <description><![CDATA[
    <p>
      <a href="Overview.html">Design Issues</a>
    </p>
    <h1>
      Preface
    </h1>
    <h2>
      Architectural and philosophical points
    </h2>
    <blockquote>
      <i>These statements of architectural principle explain the
      thinking behind the specifications. These are personal notes
      by Tim Berners-Lee: they are not endorsed by W3C. They are
      aimed at the technical community, to explain reasons, provide
      a framework to provide consistency for future
      developments, and avoid repetition of discussions once
      resolved.</i>
    </blockquote>
    <p>
      I have found that, having started this set of notes in 1990
      in the (for me) novel medium of hypertext, it has been
      difficult to tear free of it: my attempts to lend hierarchical
      or serial order have been doomed to failure. Further, as
      ideas and these web pages have evolved, it has been important
      for me to be able to reorganize my thoughts, grab a new leaf,
      shake the tree and regard it as the root. So the reader needs
      to be aware of this, that each page may be an attempt to put
      across a given concept serially, but if you are looking for
      an order of concepts and subconcepts, you have as much hope
      as you would with words in the dictionary. I can sympathise
      with Ted Nelson whose <cite>Literary Machines</cite> has "a
      Chapter Zero, several Chapters One, one Chapter Two, and
      several Chapters Three", not to mention with Ludwig
      Wittgenstein whose <cite>Philosophical Investigations</cite>
      have only paragraph numbers for structure.
    </p>
    <p>
      The notes are in a constant state of flux, sometimes minute
      by minute, sometimes decade by decade. Their status varies -
      some have typos and spelling errors, and represent thoughts
      half expressed, whereas others describe resolved issues which
      have become fundamental architectural decisions in the
      conceptual infrastructure of the Web. Again, something in me
      resists the urge to draw a line and move things from here
      into a "done deals" space. I try to represent accurately the
      status of a given page in the section above the rule at the
      top. Definitive documents, reviewed by W3C members and
      others, you will find elsewhere.
    </p>
    <p>
      Neither have I found it easy to restrict myself to separated
      technical or philosophical arguments and somehow this I feel
      is also important, the sharpening happening, after all, where
      the knife meets the stone.
    </p>
    <p>
      I did draw a line between the really old ones whose dates I
      couldn't necessarily even find, and which were too out of
      date to find themselves linked into any current discussion.
      Hence the brown archival section on the contents page, and
      the brown archived notes it points to. These are really only
      available for completeness of archival, and not suggested
      reading. The other remarks here do not apply to them.
    </p>
    <p>
      For all its (or because of its) lax flexibility, I have
      personally found this space a useful one. I have used it to
      place opinions and explanations which I have needed to
      express, and have found it useful to be able to express them
      later to others. But also I have found it a personally useful
      exercise to review the state of order and disorder from time
      to time, part of the intuitive process of making a new step.
      But that is all personal use, and for the hesitations I
      have just outlined, I have never felt that the whole
      collection has been worthy of recommending as reading as a
      work in itself.
    </p>
    <p>
      Tim BL October 1998
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 03 Oct 2023 00:00:00 GMT</pubDate>
  <title>Pretty Print</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Pretty.html</link>
    <guid>https://www.w3.org/DesignIssues/Pretty.html</guid>
      <description><![CDATA[
    <p>"Pretty Printing" is the practice of formatting code in a
    computer language so that it is easy to read by humans, and
    aesthetically pleasing. There is a <a href="https://en.wikipedia.org/wiki/Prettyprint">Wikipedia
    article</a> which goes over it well in general. In the case of
    data in the Solid ecosystem there is value not only in
    legibility but also in making small changes to data evident as
    small changes to the text, and in tests as a simple more or
    less canonical form to easily compare expected and actual
    results. And specifically for RDF data in N3 or Turtle, those
    languages provide powerful features for representing graph
    data in a form close to the English language. Legibility and
    aesthetic properties affect not only reading code but also
    people writing it. If you are expressing instance data by hand,
    from random facts through configuration files to ontologies and
    rules, being able to write it in a clear, English-like way also
    is more pleasant, and easier to check for mistakes. When a
    group is collaborating on the same data document, such as on
    a whiteboard or in a chat, then legibility and writability are
    both important.</p>
  
<p><a href="https://www.w3.org/DesignIssues/Pretty.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 01 Sep 1998 00:00:00 GMT</pubDate>
  <title>Principles of Design</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Principles.html</link>
    <guid>https://www.w3.org/DesignIssues/Principles.html</guid>
      <description><![CDATA[
    <h1>
      Principles of Design
    </h1>
    <p>
      Again and again we fall back on the folklore of the
      principles of good design. Sometimes I need a URI for them so
      this is started as a collection of them. I have written about
      some in many places. Principles such as <b>simplicity</b> and
      <b>modularity</b> are the stuff of software engineering;
      <b>decentralization</b> and <b>tolerance</b> are the life and
      breath of the Internet. Brian Carpenter has enumerated some
      principles of design of the Net [<a href="Architecture.html#carpenter">carpenter</a>]. The third pair
      of ideas I have found commonly useful for the Web. I
      mentioned them in a keynote at WWW7 and the note on <a href="Evolution.html">Evolvability</a>.
    </p>
    <p>
      This is largely "motherhood and apple pie" but it still needs
      a home.
    </p>
    <h2>
      <a name="KISS" id="KISS">Simplicity</a>
    </h2>
    <blockquote>
      <p>
        "Keep it simple, stupid!"
      </p>
    </blockquote>
    <p>
      Simplicity is easy to quote but often ignored in strange
      ways. Perhaps this is because it is in the eye of the beholder.
    </p>
    <p>
      A language which uses fewer basic elements to achieve the
      same power is simpler.
    </p>
    <p>
      Sometimes simplicity is confused with "easy to understand".
      For example, a two-line solution which uses recursion is
      pretty simple, even though some people might find it easier
      to work through a 10-line solution which avoids recursion.
    </p>
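    <p>
      As a rough illustration of that contrast (a sketch added for this
      point, not taken from the original note), here is the nesting
      depth of a list computed both ways in Python. The function names
      depth and depth_iterative are invented for the example.
    </p>
<pre>
```python
def depth(tree):
    """Recursive version: a non-list has depth 0; a list is one deeper than its deepest child."""
    if not isinstance(tree, list):
        return 0
    return 1 + max(map(depth, tree), default=0)


def depth_iterative(tree):
    """Same result without recursion, using an explicit stack of (node, level) pairs."""
    deepest = 0
    stack = [(tree, 0)]
    while stack:
        node, level = stack.pop()
        if isinstance(node, list):
            level += 1
            deepest = max(deepest, level)
            for child in node:
                stack.append((child, level))
    return deepest
```
</pre>
    <p>
      Both functions return 3 for the list [1, [2, [3]]]. The recursive
      one uses fewer basic elements; whether it is easier to follow
      depends on the reader.
    </p>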
    <p>
      In XML, "Processing Instructions", those things which start
      with "&lt;?" are <strong>not</strong> simple. They look
      simple, just an extra sort of thing in the language, but they
      complicate what was a very clean design of elements and
      attributes, and a complication in the underlying syntax
      has great effect. All specifications which refer to XML
      processing will have to figure out what to do about
      processing instructions as well as elements.
    </p>
    <h2>
      <a name="Modular" id="Modular">Modular Design</a>
    </h2>
    <p>
      When you design a system, or a language, then if the features
      can be broken into relatively loosely bound groups of
      relatively closely bound features, then that division is a
      good thing to be made a part of the design. This is just good
      engineering. It means that when you want to change the
      system, you can with luck in the future change only one part,
      which will only require you to understand (and test) that
      part. This will allow other people to independently change
      other parts at the same time. This is just classic good
      software design and books have been written about it. The
      corollary, the TOII, is less frequently met.
    </p>
    <p>
      Modular design hinges on the simplicity and abstract nature
      of the interface definition between the modules. A design in
      which the insides of each module need to know all about each
      other is not a modular design but an arbitrary partitioning
      of the bits.  <a href="Modularity.html">(More ...)</a>
    </p>
    <h2>
      <a name="Modular2" id="Modular2">Being part of a Modular Design</a>
    </h2>
    <p>
        It is not only necessary to make sure your own system is designed
        to be made of modular parts.  It is also necessary to realize that
        your own system, no matter how big and wonderful it seems now,
        should always be designed to be a <em>part</em> of another
        larger system.
    </p>
    <p>This is often much more difficult than modularity.
    </p>
    <h2 id="Tolerance">
      Tolerance
    </h2>
    <blockquote>
      <p>
        "Be liberal in what you require but conservative in what
        you do"
      </p>
    </blockquote>
    <p>
      This is the expression of a principle which applies pretty
      well in life, (it is a typical UU tenet), and is commonly
      employed in design across the Internet.
    </p>
    <p>
      Write HTML 4.0-strict. Accept HTML-4.0-Transitional (a
      superset of strict).
    </p>
    <p>
      This principle can be contentious. When browsers are lax
      about what they expect, the system works better but also it
      encourages laxness on the part of web page writers. The
      principle of tolerance does not blunt the need for a
      perfectly clear protocol specification which draws a precise
      distinction between conformance and non-conformance. The
      principle of tolerance is no excuse for a product which
      contravenes a standard.
    </p>
    <h2 id="Decentrali">
      Decentralization
    </h2>
    <p>
      This is a principle of the design of distributed systems,
      including societies. It points out that any single common
      point which is involved in any operation tends to limit the
      way the system scales, and produce a single point of complete
      failure.
    </p>
    <p>
      Centralization in social systems can apply to concepts, too.
      For example, if we make a knowledge representation system
      which requires anyone who uses the concept of "automobile" to
      use the term "http://www.kr.org/stds/industry/automobile"
      then we restrict the set of uses of the system to those for
      whom this particular formulation of what an automobile is
      works. The Semantic Web must avoid such conceptual
      bottlenecks just as the Internet avoids such network
      bottlenecks.
    </p>
    <h2>
      <a name="TOII" id="TOII">Test of Independent Invention</a>
    </h2>
    <blockquote>
      <p>
        If someone else had already invented your system, would
        theirs work with yours?
      </p>
    </blockquote>
    <p>
      Does this system have to be the only one of its kind? This
      simple thought test is described in more detail in "<a href="Evolution.html#TOII">Evolution</a>" in these Design Issues.
      It is connected to modularity inside-out: designing a system not to be
      modular in itself, but to be a part of an as-yet unspecified
      larger system. A critical property here is that the system
      tries to do one thing well, and leaves other things to other
      modules. It also has to avoid conceptual or other
      centralization, as no two modules can claim the need to be
      the unique center of a larger system.
    </p>
    <h2>
      <a name="PLP" id="PLP">Principle of Least Power</a>
    </h2>
    <p>
      In choosing computer languages, there are classes of program
      which range from the plainly descriptive (such as Dublin Core
      metadata, or the content of most databases, or HTML) through
      logical languages of limited power (such as access control
      lists, or <em>conneg</em> content negotiation) which include
      limited propositional logic, through declarative languages
      which verge on the Turing Complete (PostScript is, but PDF
      isn't, I am told) through those which are in fact Turing
      Complete though one is led not to use them that way (XSLT,
      SQL) to those which are unashamedly procedural (Java, C).
    </p>
    <p>
      The choice of language is a common design choice. The low
      power end of the scale is typically simpler to design,
      implement and use, but the high power end of the scale has
      all the attraction of being an open-ended hook into which
      anything can be placed: a door to uses bounded only by the
      imagination of the programmer.
    </p>
    <p>
      Computer Science in the 1960s to 80s spent a lot of effort
      making languages which were as powerful as possible. Nowadays
      we have to appreciate the reasons for picking not the most
      powerful solution but the least powerful. The reason for this
      is that the less powerful the language, the more you can do
      with the data stored in that language. If you write it in a
      simple declarative form, anyone can write a program to
      analyze it in many ways. The Semantic Web is an attempt,
      largely, to map large quantities of existing data onto a
      common language so that the data can be analyzed in ways
      never dreamed of by its creators. If, for example, a web page
      with weather data has RDF describing that data, a user can
      retrieve it as a table, perhaps average it, plot it, deduce
      things from it in combination with other information. At the
      other end of the scale is the weather information portrayed
      by the cunning Java applet. While this might allow a very
      cool user interface, it cannot be analyzed at all. The search
      engine finding the page will have no idea of what the data is
      or what it is about. Thus the only way to find out what a
      Java applet means is to set it running in front of a person.
    </p>
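    <p>
      A minimal sketch of the weather example, assuming the readings
      are published as plain declarative data. A Python list of records
      stands in here for the RDF the note has in mind, and the field
      names and values are made up for illustration. Because the data
      is just data, independent programs can average it, tabulate it or
      pick out an extreme without the publisher having foreseen any of
      those uses.
    </p>
<pre>
```python
from statistics import mean

# Hypothetical readings; in the note's scenario these would be RDF on a web page.
READINGS = [
    {"day": "Mon", "temp_c": 11.0},
    {"day": "Tue", "temp_c": 14.0},
    {"day": "Wed", "temp_c": 17.0},
]


def average_temp(readings):
    """Average temperature across all readings."""
    return mean(r["temp_c"] for r in readings)


def warmest_day(readings):
    """Name of the day with the highest temperature."""
    return max(readings, key=lambda r: r["temp_c"])["day"]


def as_table(readings):
    """Render the readings as tab-separated lines, one per day."""
    return "\n".join(f"{r['day']}\t{r['temp_c']}" for r in readings)
```
</pre>
    <p>
      None of these three functions could be written against an applet
      that only draws the same numbers on the screen.
    </p>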
    <p>
      I hope that is a good enough explanation of this principle.
      There are millions of examples of the choice. I chose HTML
      not to be a programming language because I wanted different
      programs to do different things with it: present it
      differently, extract tables of contents, index it, and so on.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 05 Nov 2023 00:00:00 GMT</pubDate>
  <title>Inference over Private Data</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/PrivateData.html</link>
    <guid>https://www.w3.org/DesignIssues/PrivateData.html</guid>
      <description><![CDATA[
  <p>
  Ah, the holy grail of AI! Private data pods are a really hot topic
  right now, and for good reason - they offer the potential
  to unlock the power of AI while keeping private data safe and secure.
  By allowing people to keep their data in a secure, private "pod" and
  then only granting access to AI algorithms when needed,
  it could create a whole new world of possibilities for personalized
  services and applications while still protecting people's privacy.
  Imagine being able to get personalized health advice,
  financial planning, or even just customized entertainment
  recommendations without having to worry about your data being misused or
  sold to third parties! - pi.ai
</p>


<p><a href="https://www.w3.org/DesignIssues/PrivateData.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 05 Nov 2017 00:00:00 GMT</pubDate>
  <title>When use a literal for a URI in Linked Data</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/QuotingURIs.html</link>
    <guid>https://www.w3.org/DesignIssues/QuotingURIs.html</guid>
      <description><![CDATA[

  <h1>When use a literal for a URI in Linked Data</h1>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 01 Sep 1998 00:00:00 GMT</pubDate>
  <title>RDF and Relational databases</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/RDB-RDF.html</link>
    <guid>https://www.w3.org/DesignIssues/RDB-RDF.html</guid>
      <description><![CDATA[
    <h1>
      Relational Databases on the Semantic Web
    </h1>
    <p>
      There are many other data models which RDF's Directed
      Labelled Graph (DLG) model compares closely with, and maps
      onto. See a summary in
    </p>
    <ul>
      <li>
        <a href="RDFnot.html">What the Semantic Web can
        represent</a>
      </li>
    </ul>
    <p>
      One is the Relational Database (RDB) model.
    </p>
    <h2>
      <a name="ER" id="ER">The Semantic Web and Entity-Relationship
      models</a>
    </h2>
    <p>
      Is the RDF model an entity-relationship model? Yes and no. It
      is great as a basis for ER-modelling, but because RDF is used
      for other things as well, RDF is more general. RDF is a model
      of entities (nodes) and relationships. If you are used to the
      "ER" modelling system for data, then the RDF model is
      basically an opening of the ER model to work on the Web. A
      typical ER model involves entity types, and for each entity
      type there is a set of relationships (slots in the typical
      ER diagram). The RDF model is the same, except that
      relationships are first class objects: they are identified by
      a URI, and so anyone can make one. Furthermore, the set of
      slots of an object is not defined when the class of an object
      is defined. The Web works through anyone being (technically)
      allowed to say anything about anything. This means that a
      relationship between two objects may be stored apart from any
      other information about the two objects. This is different
      from object-oriented systems often used to implement ER
      models, which generally assume that information about an
      object is stored in an object: the definition of the class of
      an object defines the storage implied for its properties.
    </p>
    <p>
      For example, one person may define a vehicle as having a
      number of wheels and a weight and a length, but not foresee a
      color. This will not stop another person making the assertion
      that a given car is red, using the color vocabulary from
      elsewhere.
    </p>
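    <p>
      The vehicle example can be sketched with statements held as
      (subject, property, value) triples. The URIs below are
      hypothetical placeholders, and plain Python sets stand in for RDF
      graphs. Two parties describe the same car independently, and the
      statements merge without either vocabulary having to know about
      the other.
    </p>
<pre>
```python
# Hypothetical namespaces and car identifier, invented for this sketch.
VEHICLE = "http://example.org/vehicle#"
COLOR = "http://example.org/color#"
CAR = "http://example.org/cars/car42"

# One person's description: wheels and weight, but no color.
first_graph = {
    (CAR, VEHICLE + "wheels", 4),
    (CAR, VEHICLE + "weightKg", 1200),
}

# Another person's assertion, stored elsewhere, using a different vocabulary.
second_graph = {
    (CAR, COLOR + "color", "red"),
}

# Merging graphs is just set union: no schema has to be amended first.
merged = first_graph | second_graph


def properties_of(graph, subject):
    """Collect every property/value pair stated about one subject."""
    return {prop: value for subj, prop, value in graph if subj == subject}
```
</pre>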
    <p>
      Apart from this simple but significant change, many concepts
      involved in ER modelling carry across directly onto the
      Semantic Web model.
    </p>
    <h2>
      The Semantic Web and Relational Databases
    </h2>
    <p>
      The semantic web data model is very directly connected with
      the model of relational databases. A relational database
      consists of tables, which consist of rows, or records. Each
      record consists of a set of fields. The record is nothing but
      the content of its fields, just as an RDF node is nothing but
      the connections: the property values. The mapping is very
      direct:
    </p>
    <ul>
      <li>a record is an RDF node;
      </li>
      <li>the field (column) name is an RDF propertyType; and
      </li>
      <li>the record field (table cell) is a value.
      </li>
    </ul>
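    <p>
      The three-part mapping above can be sketched in a few lines of
      Python. This is only an illustration under stated assumptions:
      the namespace in BASE, the way subject and property URIs are
      built, and the sample car table are all invented here, not taken
      from any specification.
    </p>
<pre>
```python
BASE = "http://example.org/db/"  # hypothetical namespace for this sketch


def rows_to_triples(table, rows, key):
    """Turn each row into a node and each cell into a (node, property, value) triple."""
    triples = []
    for row in rows:
        # The record becomes an RDF node, named here from its key column.
        subject = f"{BASE}{table}/{row[key]}"
        for column, value in row.items():
            # The column name becomes the property; the cell is the value.
            predicate = f"{BASE}{table}#{column}"
            triples.append((subject, predicate, value))
    return triples


# Made-up sample table with two records and three columns.
cars = [
    {"plate": "ABC123", "color": "red", "wheels": 4},
    {"plate": "DEF456", "color": "blue", "wheels": 4},
]
triples = rows_to_triples("car", cars, key="plate")
```
</pre>
    <p>
      Two records with three fields each give six triples, one per
      table cell.
    </p>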
    <p>
      Indeed, one of the main driving forces for the Semantic Web
      has always been the expression, on the Web, of the vast
      amount of relational database information in a way that can
      be processed by machines.
    </p>
    <p>
      RDF's serialization format -- its syntax in XML -- is a very
      suitable format for expressing relational database
      information.
    </p>
    <h3>
      Special aspects of the RDB model
    </h3>
    <p>
      Relational database systems manage RDF data, but in a
      specialized way. In a table, there are many records with the
      same set of properties. An individual cell (which corresponds
      to an RDF property) is not often thought of on its own. SQL
      queries can join tables and extract data from tables, and the
      result is generally a table. So, in practice, RDB software
      is typically optimized for doing operations
      with a small number of tables, some of which may have a large
      number of elements.
    </p>
    <p>
      A fundamental aspect of a database table is that often the
      data in a table can be definitive. Neither RDF nor RDB models
      have simple ways of expressing this. For example, not only
      does a row in a table indicate that there is a red car whose
      Massachusetts plate is "123XYZ", but the table may also carry
      the unwritten semantics that if any car has a Massachusetts
      plate then it must be in the table. (If any RDF node has a
      "Massachusetts plate number" property then that node is a
      member of the table.) The scope of the uniqueness of a value is
      in fact a very interesting property.
    </p>
    <p>
      The original RDB model defined by E.F. Codd included
      datatyping with inheritance, which he had intended would be
      implemented in the RDB products to a greater extent than it
      has. For example, typically a person's home address house
      number may be typed as an integer, and their shoe size may
      also be typed as an integer. One can as a result join
      two tables through those fields, or list people whose shoe
      size equals their house number. Practical RDB systems leave
      it to the application builder to only make operations which
      make sense. Once a database is exported onto the Web, it
      becomes possible to do all kinds of strange combinations, so
      a stronger typing becomes very useful: it becomes a set of
      inference rules.
    </p>
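    <p>
      One way to picture stronger typing, sketched in Python: each
      value carries a semantic kind as well as its integer, and a
      comparison across kinds is refused. The class Typed, the function
      same and the kind names are hypothetical, chosen only to mirror
      the shoe size and house number example; real systems would
      express this as schema or inference rules rather than a runtime
      check.
    </p>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Typed:
    """An integer tagged with a semantic kind (hypothetical, for illustration)."""
    value: int
    kind: str  # e.g. "shoeSize" or "houseNumber"


def same(a, b):
    """Compare two typed values, refusing to compare across semantic kinds."""
    if a.kind != b.kind:
        raise TypeError(f"cannot compare {a.kind} with {b.kind}")
    return a.value == b.value


# As bare integers the comparison goes through, whether or not it makes sense.
bare_match = (9 == 9)

# With semantic kinds, only like-for-like comparisons are allowed.
typed_match = same(Typed(9, "shoeSize"), Typed(9, "shoeSize"))
```
</pre>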
    <p>
      In a pure RDB model, every table has a primary key: a column
      whose value can be used to uniquely identify every row. Some
      products do not enforce this, leading to an ambiguity in the
      significance of duplicate rows. A curious feature is that the
      primary key can be changed without changing the identity of a
      row. (A person can change their name for example). SQL allows
      tables to be set up so that such changes can cascade through
      the local system to preserve referential integrity. This
      clearly won't work on the Web. One solution is to use a row
      ID -- which many systems do in fact use although SQL doesn't
      expose it in a standard way. Another is for the application
      to constrain the primary key not to change. Another is to
      put up with links breaking.
    </p>
    <p>
      RDB systems have datatypes at the atomic (unstructured)
      level, as RDF and XML will/do. Combination rules tend in RDBs
      to be loosely enforced, in that a query can join tables by
      any columns which match by datatype -- without any check on
      the semantics. You could for example create a list of houses
      that have the same number of rooms as an employee's shoe
      size, for every employee, even though the sense of that would
      be questionable.
    </p>
    <p>
      The new SQL99 standard is going to include new
      object-oriented features, such as inherited typing and
      structured contents of cells - arrays and structs. This
      combines the RDB model with things from the OO world. I don't
      deal with that here, in that the RDF model works as a lowest
      common denominator, being able to express either and both.
    </p>
    <h3>
      Schemas and Schemas
    </h3>
    <p>
      A difference between XML/RDF schemas (and SGML) on the one
      hand and database schemas on the other is the expectation
      that there will be a relatively small number of XML/RDF
      schemas. Many web sites will export documents whose structure
      is defined by the same schema, and this is in fact what
      provides the interoperability.
    </p>
    <p>
      A database schema is, as far as I know, created
      independently for each database. Even if a million companies
      clone the same form of employee database, there will be a
      million schemas, one for each database.
    </p>
    <p>
      It may be that RDF will fill a simple role in simply
      expressing the equivalence of the terms in each database
      schema.
    </p>
    <h3>
      Exposing a database on the Web
    </h3>
    <p>
      In order to be able to access a table, and make extra
      statements about it which will enable its use in more and
      more ways, the essential objects of the table must be
      exported as first class objects on the Web.
    </p>
    <p>
      When mapping any system onto the Web, the mapping into URI
      space is critical. Here we are doing this common operation
      generically for all relational databases. It would obviously
      be useful for this to be done in a consistent way across
      multiple vendors - an area for possible
      standardization.
    </p>
    <p>
      Here is a random example I may have gotten wrong, based on
      what I understand of the naming within databases. The database
      itself is defined within a schema which is listed in a
      catalog.
    </p>
    <table border="1">
      <caption>
        Mapping an RDB into the Web - strawman
      </caption>
      <tbody>
        <tr>
          <td>
            Catalog
          </td>
          <td>
            http://www.acme.com/mycat
          </td>
          <td></td>
        </tr>
        <tr>
          <td>
            Schema
          </td>
          <td>
            http://www.acme.com/mycat/schema1
          </td>
          <td></td>
        </tr>
        <tr>
          <td>
            Database
          </td>
          <td>
            http://www.acme.com/mycat/schema1/empdb/
          </td>
          <th>
            Relative:
          </th>
        </tr>
        <tr>
          <td>
            Table
          </td>
          <td>
            /mycat/schema1/empdb/emps
          </td>
          <td>
            emps
          </td>
        </tr>
        <tr>
          <td>
            Column name
          </td>
          <td>
            /mycat/schema1/empdb/emps/shoe
          </td>
          <td>
            emps/shoe
          </td>
        </tr>
        <tr>
          <td>
            View
          </td>
          <td>
            /mycat/schema1/empdb/emps2
          </td>
          <td>
            emps2
          </td>
        </tr>
        <tr>
          <td>
            Row
          </td>
          <td>
            /mycat/schema1/empdb/emps/rowid=123
          </td>
          <td>
            emps/rowid=123
          </td>
        </tr>
        <tr>
          <td>
            Cell
          </td>
          <td>
            /mycat/schema1/empdb/emps/rowid=123;col=shoe
          </td>
          <td>
            emps/rowid=123;col=shoe
          </td>
        </tr>
        <tr>
          <td>
            Arbitrary query
          </td>
          <td>
            /mycat/schema1/empdb/?select+empno+from<em>[...]</em>
          </td>
          <td>
            ?select<em>[...]</em>
          </td>
        </tr>
      </tbody>
    </table>
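    <p>
      The strawman mapping above can be sketched as a few string
      builders. This is only an illustration: the function names are
      invented here, and the base URI is simply the database URI from
      the table.
    </p>

```python
# Database URI taken from the strawman table; everything else hangs off it.
DATABASE = "http://www.acme.com/mycat/schema1/empdb/"


def table_uri(table: str) -> str:
    return DATABASE + table


def column_uri(table: str, column: str) -> str:
    return f"{table_uri(table)}/{column}"


def row_uri(table: str, rowid: int) -> str:
    return f"{table_uri(table)}/rowid={rowid}"


def cell_uri(table: str, rowid: int, column: str) -> str:
    # A cell is a row URI qualified by the column name.
    return f"{row_uri(table, rowid)};col={column}"


print(table_uri("emps"))
print(column_uri("emps", "shoe"))
print(row_uri("emps", 123))
print(cell_uri("emps", 123, "shoe"))
```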
    <p>
      2002 version, see <a href="http://www.w3.org/2000/10/swap/dbork/dbview.py">real
      code</a> implemented by Dan Connolly:
    </p>
    <table border="1">
      <tbody>
        <tr>
          <th>
            <a name="table" id="table">What</a>
          </th>
          <th>
            Uriref relative to http://www.acme.com/wherever/
          </th>
          <th>
            rdf:type
          </th>
        </tr>
        <tr class="work">
          <td>
            <p>
              Database description of database "personnel"
            </p>
          </td>
          <td>
            personnel
            <p>
              (say - whatever)
            </p>
          </td>
          <td>
            soc:Work, rdfdocument, db:DatabaseDescription
          </td>
        </tr>
        <tr>
          <td>
            The conceptual database (a table of tables??)
          </td>
          <td>
            personnel#_database
            <p>
              (Arbitrary, must not clash, linked by
              <code><strong>db:describes</strong></code> from
              personnel)
            </p>
          </td>
          <td></td>
        </tr>
        <tr class="work">
          <td>
            A document giving all the data in the database. May
            support PUT?
          </td>
          <td>
            personnel/_data
            <p>
              (Arbitrary, must not clash with table names, linked
              by <strong><code>db:allData</code></strong> from
              personnel)
            </p>
          </td>
          <td>
            soc:Work, rdfdocument
          </td>
        </tr>
        <tr>
          <td>
            The concept of the table "employees": The class of
            exactly those things which are in the table.
          </td>
          <td>
            <p>
              personnel/employees#.table
            </p>
            <p>
              (was: personnel#employees, but changed to allow it to
              be deref'd to give useful data)
            </p>
            <p>
              (defined in personnel)
            </p>
          </td>
          <td>
            rdfs:Class, db:Table
          </td>
        </tr>
        <tr class="work">
          <td>
            A description of the table. Optimization: includes the
            current size of the table. Identifies primary key if
            any.
          </td>
          <td>
            personnel/employees
            <p>
              (<strong>Convention</strong>. The bit of the
              classname before the #)
            </p>
          </td>
          <td>
            soc:Work, rdfdocument, db:TableDescription
          </td>
        </tr>
        <tr class="work">
          <td>
            A description of all the tables. Just an (optional)
            optimization.
          </td>
          <td>
            personnel/_all
            <p>
              (Arbitrary, must not clash, linked by
              <code><strong>db:tableSchemas</strong></code> from
              personnel/employees)
            </p>
          </td>
          <td>
            soc:Work, rdfdocument, db:TableDescription
          </td>
        </tr>
        <tr>
          <td>
            The concept of a column in the table, the Property
            something has iff that is recorded in the table.
          </td>
          <td>
            personnel/employees#email
            <p>
              (Defined in personnel/employees)
            </p>
          </td>
          <td>
            rdf:Property, db:Column
          </td>
        </tr>
        <tr class="work">
          <td>
            A document giving all the data in the table. May
            support PUT
          </td>
          <td>
            personnel/employees/_data
            <p>
              (Arbitrary, must not clash, linked by
              <strong><code>db:tableData</code></strong> from
              personnel/employees)
            </p>
          </td>
          <td>
            soc:Work, rdfdocument
          </td>
        </tr>
        <tr class="work">
          <td>
            A document giving the data in the row for which the
            primary key is 1234. (Iff primary key exists). May
            support PUT
          </td>
          <td>
            personnel/employees/1234
            <p>
              (<strong>Convention.</strong> Note the primary key
              value must be encoded suitably!)
            </p>
          </td>
          <td>
            soc:Work, rdfdocument
          </td>
        </tr>
        <tr>
          <td>
            The concept of the thing described by that row.
          </td>
          <td>
            <p>
              personnel/employees/1234#item
            </p>
            <p>
              (<strong>Convention</strong>)
            </p>
            <p>
              (when primary key exists, then employees#_data etc
              use this URIref for the item 1234 instead of making
              anonymous nodes)
            </p>
            <p>
              (employees/_data#1234?@@)
            </p>
          </td>
          <td>
            personnel/employees#_Class
          </td>
        </tr>
        <tr class="work">
          <td>
            A document giving the information in just one cell
          </td>
          <td>
            personnel/employees/1234/email
            <p>
              (<strong>Convention</strong>)
            </p>
          </td>
          <td>
            [ is rdf:domain of personnel/employees#email ]
          </td>
        </tr>
        <tr class="work">
          <td>
            Arbitrary query
          </td>
          <td>
            personnel/_sql?select+empno+from<em>[...]</em>
            <p>
              (arbitrary, linked by
              <code><strong>db:sqlService</strong></code> from
              personnel if supported.)
            </p>
          </td>
          <td>
            soc:Work, rdfdocument
          </td>
        </tr>
        <tr class="work">
          <td>
            Arbitrary HTML form field match (select * from employees
            where email like "*fred*") [@details]
          </td>
          <td>
            personnel/_fquery?email=*fred*;name=Joe
            <p>
              (arbitrary, linked by
              <code><strong>db:formService</strong></code> from
              personnel if supported)
            </p>
          </td>
          <td>
            soc:Work, rdfdocument
          </td>
        </tr>
        <tr>
          <td>
            POST point for RDF data, either new data, or assertions
            that some (n3) Formula is a log:Falsehood.
          </td>
          <td>
            <p>
              personnel/_postme
            </p>
            <p>
              (arbitrary, linked by
              <code><strong>db:deltaService</strong></code> from
              personnel if supported. Could be same URI
              <code>personnel</code> in fact, as we are dealing
              with a different method)
            </p>
          </td>
          <td>
            db:postable
          </td>
        </tr>
      </tbody>
    </table>
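    <p>
      A minimal sketch of the conventions in the table above, not a
      copy of the linked code. The function names are invented, and
      percent-encoding is only one assumption about what encoding the
      primary key "suitably" could mean.
    </p>

```python
from urllib.parse import quote

# Base URI from the table heading.
BASE = "http://www.acme.com/wherever/"


def table_description(db: str, table: str) -> str:
    # Document describing the table, e.g. personnel/employees
    return f"{BASE}{db}/{table}"


def table_class(db: str, table: str) -> str:
    # The class of exactly those things which are in the table.
    return table_description(db, table) + "#.table"


def column_property(db: str, table: str, column: str) -> str:
    return f"{table_description(db, table)}#{column}"


def row_document(db: str, table: str, key) -> str:
    # Assumption: percent-encode the key; safe="" also encodes "/".
    return f"{table_description(db, table)}/{quote(str(key), safe='')}"


def row_item(db: str, table: str, key) -> str:
    # The thing described by the row, as opposed to the document about it.
    return row_document(db, table, key) + "#item"


def cell_document(db: str, table: str, key, column: str) -> str:
    return f"{row_document(db, table, key)}/{column}"


print(table_class("personnel", "employees"))
print(row_item("personnel", "employees", 1234))
print(cell_document("personnel", "employees", 1234, "email"))
print(row_document("personnel", "employees", "a/b c"))
```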
    <p>
      @@@ How to use typing to indicate that the URI in the table
      is a (relative?) URI to another object, not a string?
    </p>
    <p>
      @@@ This works fine when implemented live on a database.
      However, it is a little tricky to emulate in a typical
      file-based web server because of the use of "personnel" in
      this case both as directory and as
    </p>
    <p>
      One of the things which makes life easier is to make the
      mapping so that the relative URI syntax can be used to
      advantage. For example, here, everything within the database
      (the scope of an SQL statement) can be written as a short
      URI.
    </p>
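    <p>
      To illustrate, here is how the short relative forms from the
      strawman table resolve against the database URI using the
      standard relative URI rules, as implemented by Python's
      <code>urllib.parse.urljoin</code>. The choice of library is
      mine; any conforming resolver should give the same results.
    </p>

```python
from urllib.parse import urljoin

# The database URI is the base; note the trailing slash.
DATABASE = "http://www.acme.com/mycat/schema1/empdb/"

RELATIVE = ["emps", "emps/shoe", "emps/rowid=123", "emps/rowid=123;col=shoe"]

resolved = {rel: urljoin(DATABASE, rel) for rel in RELATIVE}

for rel, absolute in resolved.items():
    print(rel, "->", absolute)
```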
    <p>
      There is a question as to how much of the SQL query syntax
      should be turned into identifiers. For example, is a query on
      a primary key really an identifier? Is the extraction of a
      single cell really an identifier? It would be useful to be
      able to treat them as such. However, it would be wiser to use
      the "?" convention to indicate a generalized SQL idempotent
      query. (A URL should <a href="Axioms.html#get">of course</a>
      <em>never</em> be used to refer to the results of a
      table-changing operation such as UPDATE or DELETE. In this
      case, if HTTP were used, an SQL query should IMHO be POSTed
      to the database URI. Of course, you can use your favorite
      networked database access protocol.)
    </p>
    <p>
      In the above the column name of the table could be referred to
      using the table as a namespace, a row for example being
    </p>
    <pre>&lt;foo
  xmlns:t="http://www.example.com/mycat/personnel/employees"&gt;
  &lt;t:email&gt;joe@example.com&lt;/t:email&gt;
  &lt;t:age&gt;45&lt;/t:age&gt;
&lt;/foo&gt;
</pre>
    <p>
      and one row of the result of joining this table (of
      people) and another table (about people) by their primary
      keys would use namespaces from both tables:
    </p>
    <pre>&lt;foo
  xmlns:t="http://www.example.com/mycat/personnel/employees"
  xmlns:u="http://www.acme.com/mycat/schema1/empdb/likes"&gt;
  &lt;t:email&gt;joe@example.com&lt;/t:email&gt;
  &lt;t:age&gt;45&lt;/t:age&gt;
  &lt;u:music&gt;blues&lt;/u:music&gt;
&lt;/foo&gt;
</pre>
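    <p>
      A sketch of building such a joined row programmatically, using
      the two table URIs above as namespaces. The helper
      <code>joined_row</code> and the dictionary inputs are
      hypothetical; a real system would take them from the query
      result.
    </p>

```python
import xml.etree.ElementTree as ET

# Each table URI serves as the namespace for its column names.
T = "http://www.example.com/mycat/personnel/employees"
U = "http://www.acme.com/mycat/schema1/empdb/likes"


def joined_row(left: dict, right: dict) -> ET.Element:
    """Build one row element whose children are named by table and column."""
    row = ET.Element("foo")
    for column, value in left.items():
        ET.SubElement(row, f"{{{T}}}{column}").text = str(value)
    for column, value in right.items():
        ET.SubElement(row, f"{{{U}}}{column}").text = str(value)
    return row


# Ask the serializer to use the prefixes t and u, as in the example above.
ET.register_namespace("t", T)
ET.register_namespace("u", U)

row = joined_row({"email": "joe@example.com", "age": 45}, {"music": "blues"})
print(ET.tostring(row, encoding="unicode"))
```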
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 01 Jan 2000 00:00:00 GMT</pubDate>
  <title>Changes needed to the RDF core in 2010</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/RDF-Future.html</link>
    <guid>https://www.w3.org/DesignIssues/RDF-Future.html</guid>
      <description><![CDATA[Today (2010-06-26) and tomorrow, a workshop
is about to take place about the future of RDF. Whilst the
scheduling of the workshop did not allow me to participate
directly, I summarize here my probably well-known views about some
aspects of the future of RDF at the bottom level.
This does not address the many other exciting things which need
to be done in the semantic web, but just the RDF model and its
common serializations.
<p><a href="https://www.w3.org/DesignIssues/RDF-Future.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 01 Jan 1999 00:00:00 GMT</pubDate>
  <title>Why RDF model is not exactly the XML model</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/RDF-XML.html</link>
    <guid>https://www.w3.org/DesignIssues/RDF-XML.html</guid>
      <description><![CDATA[
    <h1>
      Why RDF model is different from the XML model
    </h1>
    <p>
      This note is an attempt to answer the question, "Why should I
      use RDF - why not just XML?". This has been a question which
      has been around ever since RDF started. At the W3C Query
      Language workshop, there was a clear difference of view
      between those who wanted to query documents and those who
      wanted to extract the "meaning" in some form and query that.
      This is typical. I wrote this note in a frustrated attempt to
      explain what the RDF model was for those who thought in terms
      of the XML model. I later listened to those who thought in
      terms of the XML model, and tried to write it the other way
      around in <a href="XML-Semantics.html">another note</a>. This
      note assumes the XML data model in all its complexity,
      and the RDF syntax as in RDF Model and Syntax, in all its
      complexity. It doesn't try to map one directly onto the other
      -- it expresses the RDF model using XML.
    </p>
    <p>
      Let me take as an example a single RDF assertion. Let's try
      "The author of the <i>page</i> is <i>Ora</i>". This is
      traditional. In RDF this is a triple
    </p>
    <pre>triple(author, page, Ora)
</pre>
    <p>
      which you can think of as represented by the diagram
    </p>
    <p align="center">
      <img src="diagrams/aac.gif" width="265" height="73" alt="page ---has author---> Ora" border="0">
    </p>
    <p>
      How would this information typically be represented in
      XML?
    </p>
    <pre>&lt;author&gt;
     &lt;uri&gt;page&lt;/uri&gt;
     &lt;name&gt;Ora&lt;/name&gt;
&lt;/author&gt;
</pre>
    <p>
      or maybe
    </p>
    <pre>&lt;document href="page"&gt;
   &lt;author&gt;Ora&lt;/author&gt;
&lt;/document&gt;
</pre>
    <p>
      or maybe
    </p>
    <pre>&lt;document&gt;
   &lt;details&gt;
    &lt;uri&gt;href="page"&lt;/uri&gt;
    &lt;author&gt;
        &lt;name&gt;Ora&lt;/name&gt;
    &lt;/author&gt;
    &lt;/details&gt;
&lt;/document&gt;
</pre>
    <p>
      or maybe
    </p>
    <pre>&lt;document&gt;
   &lt;author&gt;
    &lt;uri&gt;href="page"&lt;/uri&gt;
    &lt;details&gt;
        &lt;name&gt;Ora&lt;/name&gt;
    &lt;/details&gt;
    &lt;/author&gt;
&lt;/document&gt;

&lt;document href="http://www.w3.org/test/page" author="Ora" /&gt;
</pre>
    <h2>
      The XML Graph
    </h2>
    <p>
      These are all perfectly good XML documents - and to a person
      reading them, they mean the same thing. To a machine parsing
      them, they produce different XML trees. Suppose you look at
      the XML tree
    </p>
    <pre>&lt;v&gt;
   &lt;x&gt;
    &lt;y&gt; a="ppppp"&lt;/y&gt;
    &lt;z&gt;
        &lt;w&gt;qqqqq&lt;/w&gt;
    &lt;/z&gt;
   &lt;/x&gt;
&lt;/v&gt;
</pre>
    <p>
      It's not so obvious what to make of it. The element names
      were a big hint for a human reader.
    </p>
    <p>
      <b>Without looking at the schema</b>, you know things about
      the document structure, but nothing else. You can't tell what
      to deduce. You don't know whether <i>ppppp</i> is a <i>y</i>
      of <i>qqqqq</i>, or <i>qqqqq</i> is a <i>z</i> of
      <i>ppppp</i> or what. You can't even really tell what real
      questions can be asked. A source of some confusion is that in
      the xyz example above, there are lots of questions you
      <i>can</i> ask. They are questions like,
    </p>
    <ul>
      <li>Is there a w element within a details element?
      </li>
      <li>What is the content of the w element within the first x
      element?
      </li>
      <li>What is the content of the w element following the first
      y element which contains an x element whose a attribute is
      "pppp"?
      </li>
      <li>and so on.
      </li>
    </ul>
    <p>
      These are all questions about the <i>document</i>. If you
      know the document schema (a big <i>if</i>), and if that
      schema only gives you a limited number of ways of
      expressing the same thing (another big <i>if</i>), then
      asking these questions can be in fact equivalent to asking
      questions like
    </p>
    <ul>
      <li>What is the author of <i>page</i>?
      </li>
    </ul>
    <p>
      This is hairy. It is possible because there is a mapping from
      XML documents to semantic graphs. In brief, it is hairy
      because
    </p>
    <ul>
      <li>The mapping is many to one
      </li>
      <li>You need a schema to know what the mapping is
      </li>
      <li>(The schemas we are talking about for XML at the moment
      do not include that anyway and would have to have a whole
      inference language added)
      </li>
      <li>The expression you need for querying something in terms
      of the XML tree is necessarily more complicated than the
      expression you need for querying something in terms of the
      RDF tree.
      </li>
    </ul>
    <p>
      This last is a big one. If you try to write down the
      expression for the author of a document where the information
      is in some arbitrary XML schema, you can probably do it
      though it may or may not be very pretty. If you try to
      combine more than one property into a combined expression,
      (give me a list of books by the same author as this one),
      saying it in XML gets too clumsy to consider.
    </p>
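    <p>
      By contrast, here is a sketch of the same combined question
      asked of a set of triples. The triples follow the order used
      earlier in this note, (property, subject, object); the book
      names and helper functions are invented for illustration.
    </p>

```python
# Made-up data in the note's order: (property, subject, object).
TRIPLES = {
    ("author", "book1", "Ora"),
    ("author", "book2", "Ora"),
    ("author", "book3", "Joe"),
}


def objects(prop, subject, triples=TRIPLES):
    return {o for (p, s, o) in triples if p == prop and s == subject}


def subjects(prop, obj, triples=TRIPLES):
    return {s for (p, s, o) in triples if p == prop and o == obj}


def by_same_author(book, triples=TRIPLES):
    """Books sharing an author with the given book, excluding the book itself."""
    others = set()
    for author in objects("author", book, triples):
        others |= subjects("author", author, triples)
    return others - {book}


print(by_same_author("book1"))
```

    <p>
      The combined query is just two lookups chained together, and it
      does not depend on how any document happened to nest its
      elements.
    </p>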
    <p>
      (Think of trying to define the addition of numbers by regular
      expression operations on the strings. It's possible for
      addition. When you get to multiplication it gets ridiculous -
      to solve the problem you would end up reinventing numbers as
      a separate type.)
    </p>
    <p>
      Looking at the simple XML encoding above,
    </p>
    <pre>&lt;author&gt;
     &lt;uri&gt;page&lt;/uri&gt;
     &lt;name&gt;Ora&lt;/name&gt;
&lt;/author&gt;
</pre>
    <p>
      it could be represented as a graph
    </p>
    <p>
      <img src="diagrams/xml1.gif" alt="A graph of the XML tree with 3 element nodes each with name and some with content" width="" height="0">
    </p>
    <p>
      We can represent the tree more concisely if we make a
      shorthand by writing the name of each element inside its
      circle:
    </p>
    <p>
      <img src="diagrams/aab.gif" width="" height="0">
    </p>
    <p>
      Of course the RDF tree which this represents (although it
      isn't obvious from the XML tree except to those who know) is
    </p>
    <p align="center">
      <img src="diagrams/aac.gif" width="265" height="73" alt="page ---has author---> Ora" border="0">
    </p>
    <p>
      Here we have made a shorthand again by making the
      label for each part its URI.
    </p>
    <p>
      The complexity of querying the XML tree is because there are
      in general a large number of ways in which the XML maps onto
      the logical tree, and the query you write has to be
      independent of the choice of them. So much of the query is an
      attempt to basically convert the set of all possible
      representations of a fact into one statement. This is just
      what RDF does. It gives you some standard ways of writing
      statements so that however they occur in a document, they
      produce the same effect in RDF terms. The same RDF tree
      results from many XML trees.
    </p>
    <p>
      Wouldn't it be nice if we could label our XML so that when
      the parser read it, it could find the assertions (triples)
      and distinguish their subjects and objects, so as to just
      deduce the logical assertions without needing RDF? This is
      just what RDF does, though.
    </p>
    <h2>
      The RDF Graph
    </h2>
    <p>
      In fact RDF is very flexible - it can represent this triple
      in many ways in XML so as to be able to fit in with
      particular applications, but just to pick one way, you could
      write the above as
    </p>
    <pre>&lt;Description about="http://www.w3.org/test/page" Author ="Ora" /&gt;
</pre>
    <p>
      I have missed out the stuff about namespaces. In fact as
      anyone can create or own the verbs, subjects and objects in a
      distributed Web, any term has to be identified by a URI
      somehow. In real life this example works out to something
      more like
    </p>
    <pre>&lt;?xml version="1.0"?&gt;
  &lt;Description

          xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
          xmlns:s="http://docs.r.us.com/bibliography-info/"
 
                  about="http://www.w3.org/test/page" 
                  s:Author ="http://www.w3.org/staff/Ora" /&gt;
</pre>
    <p>
      You can think that the "description" RDF element gives the
      clue to the parser as to how to find the subjects, objects
      and verbs in what follows.
    </p>
    <p>
      This is pretty much the most shorthand way of using the base
      RDF in XML. There are others which are longer, but more
      efficient when you have, for instance, sets of many
      properties of the same object. The useful thing is that of
      course they all convey the same triple
    </p>
    <p align="center">
      <img src="diagrams/aac.gif" width="265" height="73" alt="page ---has author---> Ora" border="0">
    </p>
    <p>
      It is a mess when you use questions about a document to try
      to ask questions about what the document is trying to convey.
      It will work. In a way. But flagging the grammar explicitly
      (RDF syntax is a way of doing this) is a whole lot better.
    </p>
    <p>
      Things you can do with RDF which you can't do with XML
      include
    </p>
    <ul>
      <li>You can parse the semantic tree, which ends up giving you
      a set of (possibly mutually referential) triples and then you
      can use the ones you want ignoring the ones you don't
      understand.
      </li>
    </ul>
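    <p>
      The point in the list above can be sketched in a few lines.
      Triples are again written as (property, subject, object); the
      set of understood properties and the "flavour" property are
      hypothetical.
    </p>

```python
# Hypothetical: the only property this application understands.
KNOWN_PROPERTIES = {"author"}


def understood(triples, known=KNOWN_PROPERTIES):
    """Keep the triples whose property is known; silently skip the rest."""
    return [t for t in triples if t[0] in known]


parsed = [
    ("author", "page", "Ora"),
    ("flavour", "page", "strawberry"),  # unknown property, simply ignored
]

print(understood(parsed))
```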
    <p>
      Problems with basing your understanding on the structure
      include
    </p>
    <ul>
      <li>Without having gone to the trouble of getting the schema,
      or having an application hand-programmed to recognise a
      particular document type, you can't pick up any semantic
      information from a document;
      </li>
      <li>When an XML schema changes, it could typically introduce
      new intermediate elements (like "details" in the tree above
      or "div" in HTML). These may or may not invalidate any
      query which has been based on the structure of the document.
      </li>
      <li>If you haven't gone to the trouble of making a semantic
      model, then you may not have a well defined one.
      </li>
    </ul>
    <p>
      I'll end this with some examples of the last problem. Clearly
      they can be avoided by good design even in an XML system
      which does not use RDF. Using RDF makes things easier.
    </p>
    <h2>
      Get it right
    </h2>
    <p>
      If you haven't gone to the trouble of making a semantic
      model, then you may not have a well defined one. What does
      that mean? I can give some general examples of ambiguities
      which crop up in practice. In RDF, you need a good idea about
      what is being said about what, and they would tend not to
      arise.
    </p>
    <p>
      Look at a label on the jam jar which says: "Expires 1999".
      What expires: the label, or the jam? Here the ambiguity is
      between a statement about a statement about a document, and a
      statement about a document.
    </p>
    <p>
      Another example is an element which qualifies another,
      apparently independent, element. When information is
      assembled in a set of independently thrown in records often
      ambiguities can arise because of the lack of logic. HTTP
      headers (or email headers)
      are a good example. These things can work when one program
      handles all the records, but when you start mixing records
      you get trouble. In XML it is all too easy to fall into the
      trap of having two elements, one describing the author, and a
      separate one as a flag that the "author" element in fact
      means not the direct author but that of a work translated to
      make the book in question. Suddenly, the "author" tag, which
      used to allow you to conclude that the author of a Finnish
      document must speak Finnish, now can be invalidated by an
      element somewhere else on the record.
    </p>
    <p>
      Another symptom of a specification where the actual semantics
      may not be as obvious as at first sight is ordering. When we
      hear that the order of a set of records is important, but the
      records seem to be defined independently, how can that be?
      Independent assertions are always valid taken individually or
      in any order. In a server configuration file, for example, the
      statement which looks like "any member has access to the
      page" might really mean "any member has access to the page
      unless an earlier rule in this file has already matched
      the page". That isn't what the spec said, but it did mention
      that the rules were processed in order until one applied.
      Represented logically, in fact there is a large nested
      conditional. There is implicit ordering when mail headers
      say, "this message is encrypted", "this message is
      compressed", "this message is ASCII encoded", "this message
      is in HTML". In fact the message is an ASCII encoded version
      of an encrypted version of a compressed version of a message
      in HTML. In email headers the logic of this has to be written
      into the fine print of the specification.
    </p>
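    <p>
      A small sketch of the server configuration case. The rules,
      paths and group names are invented; the point is only that
      "process in order until one applies" is the same thing as a
      nested conditional, so the rules are not independent
      assertions.
    </p>

```python
from fnmatch import fnmatch

# Hypothetical access rules: (path pattern, group, decision), in file order.
RULES = [
    ("/private/*", "staff", "allow"),
    ("/private/*", "member", "deny"),
    ("*", "member", "allow"),
]


def first_match(path, group, rules=RULES):
    """Process the rules in order until one applies."""
    for pattern, who, decision in rules:
        if fnmatch(path, pattern) and who == group:
            return decision
    return "deny"


def nested(path, group):
    """The same logic written out as the nested conditional it really is."""
    if fnmatch(path, "/private/*") and group == "staff":
        return "allow"
    else:
        if fnmatch(path, "/private/*") and group == "member":
            return "deny"
        else:
            if fnmatch(path, "*") and group == "member":
                return "allow"
            else:
                return "deny"


print(first_match("/private/x", "member"))
print(first_match("/private/x", "member", rules=list(reversed(RULES))))
```

    <p>
      Reversing the file changes the answer for the same request,
      which could not happen if each line were an independent
      statement.
    </p>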
    <h2>
      Order in documents
    </h2>
    <p>
      There is something fundamentally different between giving a
      machine a knowledge tree, and giving a person a document. A
      document for a person is generally serialized so that, when
      read serially by a human being, the result will be to build
      up a graph of associations in that person's head. The order
      is important.
    </p>
    <p>
      For a graph of knowledge, order is not important, so long as
      the nodes in common between different statements are
      identified consistently. (There are concepts of ordered lists
      which are important, although in RDF they break down at the
      fine level of detail to an unordered set of statements like
      "the first element of L is x", "the third element of L is z",
      etc., so order disappears at the lowest level.) In
      machine-readable documents a list of ostensibly independent
      statements where order is important often turns out to be
      statements which are by no means independent.
    </p>
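    <p>
      The parenthetical about lists can be sketched like this: an
      ordered list is broken into statements of the form "element
      <i>n</i> of L is <i>v</i>", which can be shuffled freely and
      still reassembled. The tuple layout and function names are my
      own, not RDF syntax.
    </p>

```python
import random


def to_statements(name, items):
    """One statement per element: (list name, position from 1, value)."""
    return {(name, position, value) for position, value in enumerate(items, start=1)}


def from_statements(statements):
    """Rebuild the list from the statements, whatever order they arrive in."""
    return [value for (_, position, value) in sorted(statements, key=lambda s: s[1])]


statements = list(to_statements("L", ["x", "y", "z"]))
random.shuffle(statements)  # the order of the statements carries no meaning

print(from_statements(statements))
```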
    <p>
      Some people have been reluctant to consider using an RDF tree
      because they do not wish to give up the order, but my
      assumption is that this is from constraints on processing
      human readable documents. These documents are typically not
      ripe for RDF conversion anyway.
    </p>
    <p>
      Conclusion:
    </p>
    <p>
      Sometimes it seems there is a set of people for whom the
      semantic web is the only graph which they would consider, and
      another for whom the document tree (or graph if you include
      links) is all they would consider. But it is important to
      recognise the difference.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 01 Sep 1998 00:00:00 GMT</pubDate>
  <title>What the semantic Web isn&apos;t but can represent</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/RDFnot.html</link>
    <guid>https://www.w3.org/DesignIssues/RDFnot.html</guid>
      <description><![CDATA[
    <p>
      Parenthetically, so as not to disturb the flow of what a
      semantic web <i>is</i>,...what it is not, and how other data
      models map into directed labelled graphs.
    </p>
    <h1>
      What the Semantic Web can represent
    </h1>
    <p>
      There are many other data models which RDF's Directed
      Labelled Graph (DLG) model compares closely with, and maps
      onto. This page is written with the intention of enumerating
      the similarities and differences between the models, to
      indicate how the mapping might be done and what extra
      information must be added in the process. Where the other
      models are related to previous unmet promises of computer
      science, now passed into folklore as unsolvable problems, they
      suggest a
      fear that the goal of a Semantic Web is inappropriate.
    </p>
    <p>
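      One consistent difference between the Semantic Web and many
      data models for programming languages is that the latter
      typically make the "closed world assumption", under which
      anything not stated is taken to be false; the Semantic Web
      does not.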
    </p>
    <h3>
      <a name="Semantic" id="Semantic">A Semantic Web is not
      Artificial Intelligence</a>
    </h3>
    <p>
      The concept of machine-understandable documents does not
      imply some magical artificial intelligence which allows
      machines to comprehend human mumblings. It only indicates a
      machine's ability to solve a well-defined problem by
      performing well-defined operations on existing well-defined
      data. Instead of asking machines to understand people's
      language, it involves asking people to make the extra effort.
    </p>
    <p>
      Even though it is simple to define, RDF at the level with the
      power of a semantic web will be a complete language, capable of
      expressing paradox and tautology, and in which it will be
      possible to phrase questions whose answers would, for a machine,
      require a search of the entire web and an unimaginable amount
      of time to resolve. This should not deter us from making the
      language complete. Each mechanical RDF application will use a
      schema to restrict its use of RDF to a deliberately limited
      language. However, when links are made between the RDF webs,
      the result will be an expression of a huge amount of
      information. It is clear that, because the Semantic Web must
      be able to include all kinds of data to represent the world,
      the language itself must be completely expressive.
    </p>
    <h3>
      <a name="semantic2" id="semantic2">A semantic Web will not
      require every application to use expressions of arbitrary
      complexity</a>
    </h3>
    <p>
      Even though the language itself allows expressions of
      arbitrary complexity and computability, applications which
      generate RDF will in practice be limited to generating simple
      expressions such as access control lists, privacy
      preferences, and search criteria. This does not mean that
      where a "not" is needed, it should not be drawn from a
      standard vocabulary so that any RDF engine will be able to
      recognise it as a "not".
    </p>
    <p>
      (more)
    </p>
    <h3>
      <a name="semantic1" id="semantic1">A semantic Web will not
      require proof generation to be useful: proof validation will
      be enough.</a>
    </h3>
    <p>
      The first uses, such as access control on web sites, involve
      validation of a previously prepared proof, not a requirement
      to answer an arbitrary question or to find the path to construct
      a valid proof. It is well known that to search for and
      generate a proof for an arbitrary question is typically an
      intractable process for many real world problems, and RDF
      does not require this (unsolvable) problem to be solved to be
      useful.
    </p>
    <h3>
      <a name="semantic" id="semantic">A semantic web is not an
      exact rerun of a previous failed experiment</a>
    </h3>
    <p>
      Other concerns at this point are raised about the
      relationship to Knowledge representation systems: has this
      not been tried before with projects such as <a href="Semantic.html#kif">KIF</a> and <a href="Semantic.html#cyc">cyc</a>? The answer is yes, it has, more
      or less, and such systems have been developed a long way.
      They should feed the semantic Web with design experience and
      the Semantic Web may provide a source of data for reasoning
      engines developed in similar projects.
    </p>
    <p>
      Many KR systems had a problem merging or interrelating two
      separate knowledge bases, as the model was that any concept
      had one and only one place in a tree of knowledge. They
      therefore did not scale, or pass the test of independent
      invention. [see evolvability]. The RDF world, by contrast, is
      designed with this in mind, and with the retrospective
      documentation of relationships between originally independent
      concepts.
    </p>
    <h3>
      <a name="Knowledge" id="Knowledge">Knowledge Representation
      goes Global</a>
    </h3>
    <p>
      Knowledge representation is a field which currently seems
      to have the reputation of being initially interesting, but
      which did not seem to shake the world to the extent that some
      of its proponents hoped. It made sense, and was of limited use
      on a small scale, but never made it to the large scale. This
      is exactly the state which the hypertext field was in before
      the Web. Each field had made certain centralist assumptions
      -- if not in the philosophy, then in the implementations,
      which prevented them from spreading globally. But each field
      was based on fundamentally sound ideas about the
      representation of knowledge. The Semantic Web is what we will
      get if we perform the same globalization process to Knowledge
      Representation that the Web initially did to Hypertext. We
      remove the centralized concepts of absolute truth, total
      knowledge, and total provability, and see what we can do with
      limited knowledge.
    </p>
    <h2>
      <a name="ER" id="ER">The Semantic Web and Entity-Relationship
      models</a>
    </h2>
    <p>
      Is the RDF model an entity-relationship model? Yes and no. It
      is great as a basis for ER-modelling, but because RDF is used
      for other things as well, RDF is more general. RDF is a model
      of entities (nodes) and relationships. If you are used to the
      "ER" modelling system for data, then the RDF model is
      basically an opening of the ER model to work on the Web. A
      typical ER model involves entity types, and for each entity
      type there is a set of relationships (slots in the typical
      ER diagram). The RDF model is the same, except that
      relationships are first class objects: they are identified by
      a URI, and so anyone can make one. Furthermore, the set of
      slots of an object is not defined when the class of an object
      is defined. The Web works through anyone being (technically)
      allowed to say anything about anything. This means that a
      relationship between two objects may be stored apart from any
      other information about the two objects. This is different
      from object-oriented systems often used to implement ER
      models, which generally assume that information about an
      object is stored in an object: the definition of the class of
      an object defines the storage implied for its properties.
    </p>
    <p>
      For example, one person may define a vehicle as having a
      number of wheels and a weight and a length, but not foresee a
      color. This will not stop another person making the assertion
      that a given car is red, using the color vocabulary from
      elsewhere.
    </p>
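    <p>
      A minimal sketch of this in N3 follows. The <code>veh:</code>,
      <code>col:</code> and <code>ex:</code> namespaces and all the
      property names are hypothetical, invented only for this
      illustration. The first block is what the person who defined
      the vehicle class might publish; the second could be published
      by anyone else, anywhere else:
    </p>
    <pre>@prefix veh: &lt;http://example.org/vehicle#&gt; .
@prefix col: &lt;http://example.org/color#&gt; .
@prefix ex:  &lt;http://example.org/cars#&gt; .

# Published with the vehicle schema: no color property was foreseen.
ex:car1 a veh:Vehicle;
    veh:wheels 4;
    veh:weight 1200;
    veh:length 4.5 .

# Published separately, using a color vocabulary from elsewhere.
ex:car1 col:color col:red .
</pre>
    <p>
      Nothing in the first block has to change for the second
      statement to be made, because the set of properties of
      <code>ex:car1</code> was never fixed by its class.
    </p>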
    <p>
      Apart from this simple but significant change, many concepts
      involved in ER modelling carry across directly onto the
      Semantic Web model.
    </p>
    <h2>
      <a name="Semantic1" id="Semantic1">The Semantic Web and
      Relational Databases</a>
    </h2>
    <p>
      The semantic web data model is very directly connected with
      the model of relational databases. A relational database
      consists of tables, which consist of rows, or records. Each
      record consists of a set of fields. The record is nothing but
      the content of its fields, just as an RDF node is nothing but
      the connections: the property values. The mapping is very
      direct:
    </p>
    <ul>
      <li>a record is an RDF node;
      </li>
      <li>the field (column) name is an RDF propertyType; and
      </li>
      <li>the record field (table cell) is a value.
      </li>
    </ul>
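    <p>
      As a rough sketch of the mapping, take one row of a
      hypothetical <code>employee</code> table. The table, its
      columns and the <code>emp:</code> namespace are assumptions
      made up for this example, as is the choice of building the
      node's URI from the row's key:
    </p>
    <pre>@prefix emp: &lt;http://example.org/db/employee#&gt; .

# The row being described:
#   id | name     | shoeSize
#   7  | Jane Doe | 38

# The record is the node, each column name is a property,
# and each cell is the value of that property.
&lt;http://example.org/db/employee/7&gt;
    emp:name "Jane Doe";
    emp:shoeSize 38 .
</pre>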
    <p>
      Indeed, one of the main driving forces for the Semantic Web
      has always been the expression, on the Web, of the vast
      amount of relational database information in a way that can
      be processed by machines.
    </p>
    <p>
      RDF's serialization format -- its syntax in XML -- is a very
      suitable format for expressing relational database
      information.
    </p>
    <p>
      Relational database systems manage RDF data, but in a
      specialized way. In a table, there are many records with the
      same set of properties. An individual cell (which corresponds
      to a single RDF property value) is not often thought of on its
      own. SQL queries can join tables and extract data from tables,
      and the result is generally a table. So, in practice, RDB
      software is typically optimized for doing operations
      with a small number of tables, some of which may have a large
      number of elements.
    </p>
    <p>
      RDB systems have datatypes at the atomic (unstructured)
      level, as RDF and XML will/do. Combination rules tend in RDBs
      to be loosely enforced, in that a query can join tables by
      any columns which match by datatype -- without any check on
      the semantics. You could for example create a list of houses
      that have the same number of rooms as an employee's shoe
      size, for every employee, even though the sense of that would
      be questionable.
    </p>
    <p>
      The Semantic Web is not designed just as a new data model -
      it is specifically appropriate to the linking of data of many
      different models. One of the great things it will allow is to
      add information relating different databases on the Web, to
      allow sophisticated operations to be performed across them.
    </p>
    <h2>
      <a name="Inference" id="Inference">RDF is not an Inference
      system</a>
    </h2>
    <p>
      I am not proposing any FOPC or HOL inference engine. I just
      note that HOL allows integration of multiple systems which
      use different inference engines spanning the range from
      SQL to AI. For example, a simple HOL would allow any SHOE
      rules, data and results to be expressed, and a proof found by a
      SHOE engine to be verified by anyone.
    </p>
    <h3>
      <a name="Surely" id="Surely">Surely all first-order or
      higher-order predicate calculus based systems (such as KIF)
      have failed historically to have wide impact?</a>
    </h3>
    <p>
      The same was true of hypertext systems between 1970 and 1990,
      ie before the Web. Indeed, the same objection was raised to
      the Web, and the same reasons apply for pressing on with the
      dream.
    </p>
    <p>
      The problem with all such systems was that they were
      conceptually or physically centralized. They required
      global consistency of links.
    </p>
    <p>
      Guess what? KIF is very centralized in its approach to
      organizing knowledge (the cyc ontology for example suggests
      that everyone agree on the same terms for common English
      words, which RDF does not) and it does not promote its
      concepts to being first class web objects, ie it doesn't use
      URIs to identify them. To webize KIF or KR in general is, in
      many ways, the same as to webize hypertext.
      Replace identifiers with URIs. Remove any requirement for
      global consistency. Put a significant effort into getting
      critical mass. Sit back.
    </p>
    <h3>
      Surely, many things expressible in FOPC are not efficiently
      computable?
    </h3>
    <p>
      Dead right. The goal of the semantic web is to express real
      life. Many things in real life, real questions which we will
      face, are not efficiently computable. There are two solutions
      to this. The classical (pre-web) solution is to constrain the
      language of expression so that all queries terminate in finite
      time. The weblike solution is to allow the expression of
      facts and rules in an overall language which is sufficiently
      flexible and powerful to express real life, and then to create
      subsets of the web in which specific constraints give you specific
      computational properties. An analogy is with the
      human-information systems which existed before the web. Most
      forced one to keep one's data in a hierarchy (sometimes of
      fixed depth) or a matrix (often with a specific number of
      dimensions). This gave consistency properties within the
      information system. I bet DARPA had many of these systems and
      still does. The only way they could be integrated was to
      express them in terms of a much more powerful language -
      global hypertext. Hypertext did not have any of these
      reassuring properties. People were frightened about getting
      lost in it. You could follow links forever. As it turns out,
      it is true of course that there is a problem that you can
      follow links forever in the Web. And on the Semantic Web an
      inference engine will not necessarily terminate. However, on
      the Web there are many subsystems, such as many websites, where
      life is very ordered and predictable, and searches give
      definitive results and there are no dangling links. But there
      is a HUGE advantage from exposing all this information in a
      way that allows it to be unified with all the other systems,
      ordered and unordered.
    </p>
    <h3>
      We should not expect a base inference level to include
      non-decidable computations
    </h3>
    <p>
      I have no expectation of any inference capability in the SW
      core design. The semantic web does not have HOL inference as
      a standard. I would expect any SW compliant device to be able
      to <em>validate</em> a HOL proof, but not <em>generate</em>
      one.
    </p>
    <p>
      If you take a non-HOL-complete language and extend it to HOL,
      unless you have first defined where you are going (by
      defining the HOL language and expressing SHOE in it first)
      you will very likely end up with a rather baroque HOL
      language.
    </p>
    <h3>
      The FOPC inference model is extremely intolerant of
      inconsistency [i.e. P(x) &amp; NOT (P(x)) -&gt; Q], the
      semantic web has to tolerate many kinds of inconsistency.
    </h3>
    <p>
      Toleration of inconsistency can only be done by fuzzy systems.
      We need a semantic web which will provide guarantees, and
      about which one can reason with logic. (A fuzzy system might
      be good for finding a proof -- but then it should be able to
      go back and justify each deduction logically to produce a
      proof in the unifying HOL language which anyone can check.)
      Any real SW system will work not by believing anything it
      reads on the web but by checking the source of any
      information. (I wish people would learn to do this on the Web
      as it is!) So in fact, a rule will allow a system to infer
      things only from statements of a particular form signed by
      particular keys. Within such a system, an inconsistency is a
      serious problem, not something to be worked around. If my bank
      says my bank balance is $100 and my computer says it is $200,
      then we need to figure out the problem. Same with launching
      missiles, IMHO. The semantic web model is that a URI
      dereferences to a document which parses to a directed labeled
      graph of statements. The statements can have URIs as
      parameters, so they can make statements about documents and
      about other statements. So you can express trust and reason
      about it, and limit your information to trusted consistent
      data.
    </p>
    <h3>
      Again, extension to higher order logic makes sense to me,
      requirement of FOPC inference model seems dangerous.
    </h3>
    <p>
      Most KR systems confuse information with inference tips. When
      a system stores a rule <em>a daughter of one's daughter is
      one's granddaughter</em> it is typically not just stored as
      that statement, but in a table of rules to be used by the
      algorithm at a particular time (for example whenever a parent
      of a daughter is found). The classification between data and
      various types of rule is a sort of meta level information
      which is generally not itself expressed in the language. Two
      systems must be able to interchange the logical meaning of
      the rule, even when the type of rule may be unknown to each
      other's inference engines. (Of course, the rule expressed in
      general logic may be recognizable as a rule by another system
      and absorbed as such.) The example above is logically
    </p>
    <p>
      ∀a,b,c (d(a,b) &amp; d(b,c) =&gt; gd(a,c))
    </p>
    <p>
      while for example a SHOE-based system and an Algernon-based
      system may have quite different systems for applying rules at
      different times.
    </p>
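    <p>
      As an illustration, the same rule can be written purely as
      data in N3, with no hint about when an engine should apply it.
      This is only a sketch: the <code>fam:</code> and
      <code>ex:</code> namespaces are hypothetical, and
      <code>fam:daughter</code> is read here as "has as daughter",
      matching d(a,b) above.
    </p>
    <pre>@prefix fam: &lt;http://example.org/family#&gt; .
@prefix ex:  &lt;http://example.org/people#&gt; .

# The rule itself: d(a,b) and d(b,c) implies gd(a,c).
{ ?a fam:daughter ?b . ?b fam:daughter ?c } =&gt; { ?a fam:granddaughter ?c } .

# Some data the rule could be applied to.
ex:ann  fam:daughter ex:beth .
ex:beth fam:daughter ex:cara .

# An engine which applies the rule could then conclude:
#   ex:ann fam:granddaughter ex:cara .
</pre>
    <p>
      Whether a given engine applies such a rule eagerly, lazily or
      not at all is exactly the kind of inference tip which stays
      outside the statement.
    </p>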
    <h2>
      <a name="CG" id="CG">Conceptual Graphs and the Semantic
      Web</a>
    </h2>
    <p>
      I have written <a href="CG.html">a separate set of
      notes</a> about the relationship between Conceptual Graphs and
      the Semantic Web.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 11 Oct 2009 00:00:00 GMT</pubDate>
  <title>Read-Write Linked Data</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/ReadWriteLinkedData.html</link>
    <guid>https://www.w3.org/DesignIssues/ReadWriteLinkedData.html</guid>
      <description><![CDATA[There is an architecture in which a few existing or Web
protocols are gathered together with some glue to make a world wide system in
which applications (desktop or Web Application) can work on top of a layer of
commodity read-write storage. The result is that storage becomes a commodity,
independent of the application running on it.
<p><a href="https://www.w3.org/DesignIssues/ReadWriteLinkedData.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Fri, 31 Dec 2004 00:00:00 GMT</pubDate>
  <title>Reification of RDF and N3</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Reify.html</link>
    <guid>https://www.w3.org/DesignIssues/Reify.html</guid>
      <description><![CDATA[
    <h1>
      Reifying RDF (properly), and N3
    </h1>
    <p>
      Reification in this context means the expression of something
      in a language using the language, so that it becomes
      treatable by the language. RDF graphs consist of RDF
      statements. If one wants to look objectively at an RDF graph
      and reason about it using RDF tools, then it is useful, at
      least in theory, to have an ontology for describing RDF
      statements. This note describes one suitable ontology.
    </p>
    <p>
      When RDF is extended to N3, then one way of discussing the
      semantics is to describe N3 documents in RDF. This document
      does both.
    </p>
    <p>
      The namespace used is
      <code>&lt;http://www.w3.org/2004/06/rei#&gt;</code>, for
      which here we use the <code>rei:</code> prefix. Also, we use
      the <code>ex:</code> prefix for the namespace
      <code>&lt;http://example.com/ex#&gt;</code>.
    </p>
    <h2>
      RDF Terms
    </h2>
    <p>
      RDF terms are nodes in the RDF Graph. In RDF, these can be of
      three types: named nodes, blank nodes, and literals. We will
      also call named nodes <em>symbols</em>.
    </p>
    <h3>
      Symbols
    </h3>
    <p>
      Named nodes are named by URI strings, so a named node can be
      defined simply by its URI string. The symbol which in N3 is
      written as <code>ex:joe</code> would be described as
      the RDF node:
    </p>
    <pre>[ a rei:Symbol;  rei:uri "http://example.com/ex#joe" ]
</pre>
    <h3>
      Blank nodes
    </h3>
    <p>
      Blank nodes (or Bnodes for short) are nodes which do not have URIs.
      When describing a graph, we can say that a node is blank by
      saying that it is in the class rei:BNode.
    </p>
    <pre>[ a rei:BNode ]
</pre>
    <p>
      This blank node in the description is a description of a
      blank node in the original graph. They are not the same
      blank node. We could in fact name the blank node for the
      purposes of description:
    </p>
    <pre>ex:bnode1 a rei:BNode.
</pre>
    <h3>
      Literals
    </h3>
    <p>
      Literals in an RDF graph are defined only by their value,
      just as symbols are defined by their URIs. When using RDF to
      describe RDF, RDF literals can clearly be used to give the
      value:
    </p>
    <pre>[ a rei:Literal; rei:value "The quick brown fox" ]
</pre>
    <p>
      In fact, the domain of rei:value is rei:Literal, so it is not
      necessary to explicitly state that something is a literal,
      one can just write:
    </p>
    <pre>[rei:value "The quick brown fox"]
</pre>
    <h2>
      RDF Statements
    </h2>
    <p>
      An RDF statement is defined by its three parts, known as
      subject, predicate and object, each of which is a term. In
      RDF, neither the subject nor the predicate may be a Literal.
      The statement which in N3 is <code>ex:joe ex:name "James
      Doe".</code> would be described as
    </p>
    <pre>[ a rei:Statement;
  rei:subject [rei:uri "http://example.com/ex#joe"];
  rei:predicate [rei:uri "http://example.com/ex#name"];
  rei:object [rei:value "James Doe"]
] 
</pre>
    <p>
      In fact, the fact that it is a rei:Statement would have been
      clear as the domains of rei:subject, rei:predicate and
      rei:object are all rei:Statement.
    </p>
    <h2>
      RDF Graphs
    </h2>
    <p>
      An RDF graph is a set of statements. RDF itself doesn't have
      the concept of a set, it only has the concept of an ordered
      list (RDF collection). However, the OWL relation owl:oneOf
      relates a class to a list of its members, and so we can write
      the set containing 3, 4 and 5 as <code>[ owl:oneOf (3 4
      5)]</code>. Using this convention, we can describe an RDF
      graph as the set of its statements. For example, the graph whose
      contents would be written in N3 as
    </p>
    <pre>ex:joe ex:name "James Doe".
ex:jane ex:name "Jane Doe".
</pre>
    <p>
      would be described in this ontology as:
    </p>
    <pre>[ a rei:RDFGraph;
  rei:statements [ owl:oneOf (
    [ a rei:Statement;
      rei:subject [rei:uri "http://example.com/ex#joe"];
      rei:predicate [rei:uri "http://example.com/ex#name"];
      rei:object [rei:value "James Doe"]
    ]
    [ a rei:Statement;
      rei:subject [rei:uri "http://example.com/ex#jane"];
      rei:predicate [rei:uri "http://example.com/ex#name"];
      rei:object [rei:value "Jane Doe"]
    ] ) ]
]
</pre>
    <p>
      Using the set may be ungainly, but it ensures that two
      RDFGraphs which contain the same statements are demonstrably
      the same in their reified form. (We envisage that future
      systems may have explicit processing for sets,
      and N3 syntax could even be extended to include set literal
      syntax, which would of course make this easier.)
    </p>
    <h2>
      The quoting of URIs
    </h2>
    <p>
      The use of an explicit string as the URI for the subject
      above is also ungainly, compared with the use in the original
      N3 where a prefixed symbol can be used. Why is the string
      given explicitly, instead of writing it as a symbol?
    </p>
    <p>
      Let's suppose for a moment that we just use the symbol, not
      the string for the URI:
    </p>
    <pre>#Wrong:
[ a rei:Statement;
  rei:subject ex:joe;
  rei:predicate ex:name;
  rei:object [rei:value "James Doe"]
] 
</pre>
    <p>
      This should be a description of an RDF statement. It must
      preserve the original graph, including the URIs it used. The
      statements which would be described as
    </p>
    <pre>[ rei:subject ex:joe;                        # Wrong
  rei:predicate ex:name;
  rei:object [rei:value "James Doe"]] 
</pre>
    <p>
      and
    </p>
    <pre>[ rei:subject ex:jd1;                         # Wrong
  rei:predicate ex:name;
  rei:object [rei:value "James Doe"]] 
</pre>
    <p>
      are different graphs, even if "http://example.com/ex#joe" and
      "http://example.com/ex#jd1" are two URIs for the same person.
      However, if the system knows that <code>ex:jd1</code> and
      <code>ex:joe</code> are in fact the same person, then the second
      statement can be derived from the first. It is important (in
      our application) to be able to know which name a graph used
      for something. The form of reification which is provided by
      the original RDF specification is not suitable, because it
      loses that information.
    </p>
    <h2>
      N3 Formulae
    </h2>
    <p>
      N3 extends RDF to allow graphs themselves to be another form
      of literal node. A graph can be quoted inside another graph,
      as one of the terms of a statement:
    </p>
    <pre>ex:jane  ex:knows   { ex:joe ex:name  "James Doe" }.
</pre>
    <p>
      Jane knows "joe's name is '<em>James Doe</em>'". As above,
      the quotation effect is important. Jane's knowledge is in
      these terms. Even though ex:jd1 and ex:joe may be the same
      person, Jane might not know that, and so may not know that
      ex:jd1's name is <em>James Doe</em>.
    </p>
    <p>
      An N3 formula also introduces quantification. Variables are
      introduced by allowing a given set of symbols to be
      universally quantified over the formula, and another set to
      be existentially quantified.
    </p>
    <p>
      A formula is described by three sets: the set of statements
      (the graph), the set of universals and the set of
      existentials. The semantics of an N3 formula are that the
      universal quantification is applied to the result of applying
      the existential quantification to the conjunction of the
      statements. (a la <em>forall x: exists c such that ...</em>).
      The N3 formula
    </p>
    <pre>  @keywords a.
  [] a car. { ?x a car } =&gt; { ?x a vehicle }.
</pre>
    <p>
      (roughly, <em>There is a car. Anything which is a car is a
      vehicle</em>) is shorthand for
    </p>
    <pre>@keywords a.
@forAll :x.
  @forSome :c.
    :c a car.
    {x a car } =&gt; {x a vehicle}.
</pre>
    <p>
      This would be described as a formula whose universals were just x,
      whose existentials were just c, and whose statements were
      <code>:c a car</code> and the implication, a statement whose
      subject and object are themselves formulae. The description
      which follows was obtained by passing the code above through
      <code>cwm --reify</code>. The output is:
    </p>
    <pre>@prefix : &lt;http://www.w3.org/2004/06/rei#&gt; .
@prefix owl: &lt;http://www.w3.org/2002/07/owl#&gt; .
@keywords a.
    
[ a &lt;http://www.w3.org/2000/10/swap/log#Truth&gt;;
  universals[owl:oneOf(
                "http://www.w3.org/2000/10/swap/test/reify/ex1.n3#x"  ) ];
  existentials [owl:oneOf(
                "http://www.w3.org/2000/10/swap/test/reify/ex1.n3#c"  ) ];
  statements [ owl:oneOf([
     object [uri "http://www.w3.org/2000/10/swap/test/reify/ex1.n3#car" ];
     predicate [uri "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" ];
     subject [uri "http://www.w3.org/2000/10/swap/test/reify/ex1.n3#c" ]
               ]  [
     object [
         universals [ owl:oneOf () ];
         existentials [ owl:oneOf () ];
         statements [ owl:oneOf (
           [ object  [uri "http://www.w3.org/2000/10/swap/test/reify/ex1.n3#vehicle" ];
             predicate [uri "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" ];
             subject[ uri "http://www.w3.org/2000/10/swap/test/reify/ex1.n3#x" ] ] ) ] ];
     predicate [uri "http://www.w3.org/2000/10/swap/log#implies" ];
     subject  [
         universals  [owl:oneOf ()];
         existentials [owl:oneOf () ];
         statements  [owl:oneOf  (
           [ object [uri "http://www.w3.org/2000/10/swap/test/reify/ex1.n3#car" ];
             predicate [uri "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" ];
             subject [uri "http://www.w3.org/2000/10/swap/test/reify/ex1.n3#x" ] ] ) ]] ] ) ] ].
    
</pre>
    <h2>
      Asserting truth
    </h2>
    <p>
      Note that in this mode, the formula is not only described,
      but it is also stated to be a Truth. To simply describe a
      formula as existing doesn't say anything. Formulae are
      abstract things; to say one exists doesn't add anything. Some
      would say all formulae exist, just as all lists exist.
      However, to assert that one is true asserts its contents. The
      RDF file output above has, by definition of the terms in the
      reification namespace, the same meaning as the full N3
      formula from which it is produced. It does so for any agent which
      understands the meaning of the reification namespace.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 01 Jan 2011 00:00:00 GMT</pubDate>
  <title>Using Relative URIs</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Relative.html</link>
    <guid>https://www.w3.org/DesignIssues/Relative.html</guid>
      <description><![CDATA[
    <h1>
      Using Relative URIs
    </h1>

<p>URIs are so ubiquitous that they are now used in many 
different sorts of system.  There is 
a whole class of issues which arise when bits of the
web architecture are designed by those who are not aware
of the ways in which others use them, and end up coding or speccing
so as to break others' preferred design patterns.  One of these patterns
is the use of relative URIs.
</p>
<h2>Recap: Local IDs</h2>
<p>Local identifiers are an important case of URIs. Elsewhere in these notes there are more complete descriptions of the way the URI system works, and specifically of how 
 the # character is used to take what was only a local identifier
 and, by prepending the URI of the document in which that local identifier is used,
 turn it into a global identifier.  Hence, hypertext turns into
 the WWW, and data files turn into the Web of Data, and so on.
 We call this <a href="Webize.html">webizing</a> a system.
 Here is web architecture in one line:
 </p><pre> 
            &lt;global id of document&gt;  #  &lt;local id of whatever&gt;
 </pre>
 <p>So when a file on the web refers to something internal it can just use
 the local ID.  It would be very cumbersome to include the 
 whole URI of the document itself in each of these references.
  </p><pre> 
            #  &lt;local id of whatever&gt;
 </pre>
  <p>This has several advantages:</p>
 <ul>
 <li>This file can be copied and put up on any web server</li>
 <li>The file can therefore be used as a template for other new systems</li>
 <li>The file is shorter and much easier to read than one in which all the URIs are spelled out in full.</li>
 </ul>
 
As an illustration of the last point, consider a program written in a
programming language.
Imagine writing a program, say, like: 
<pre>     pi = 3.14159265359;
     print (2 * pi);
</pre>
and it being saved in circles.py as
<pre>    &lt;file:///users/foo/programs/play/circles.py#pi&gt; = 3.14159265359;
    print (2 * &lt;file:///users/foo/programs/play/circles.py#pi&gt;);
</pre>
Not practical, not the sort of thing you can copy and move around.
The fact is that throughout computer systems, local identifiers and 
relative filenames are very common. 
We lose a lot if we drop this relative feature when we move to the web.
<p></p>


 <h2>Relative URIs</h2>
 
 <p>
 In fact, when a number of files all refer to each other,
 such as a bunch of HTML, CSS, SVG and JS files as part of  
 a web application, or a set of Turtle files as part of a Linked Data 
 dataset, then you also save a lot of space and avoid fragility by using relative 
 URIs, following the unix-like conventions for relative paths.
</p>
 
 <p>There are, though, relatively few systems
 which in fact can be implemented by a single file.
 There are a huge number in which the system is implemented in one
 directory of a file system, or in that and its descendants in the tree.
 For example, an HTML file may have local CSS files and images.
 
 The system may be edited in unix file space, where any URLs
 will be file:/// URLs, and it may at the same time be seen
 through a web server,
 where the various resources are referred to by relative URLs:
 </p><pre>    &lt;a href="intro.html"&gt;
     &lt;img src="../images/foo.png"/&gt;
    &lt;/a&gt;
 </pre>
 <p></p>
 <p>
It isn't just coincidence that the syntax of URLs matches that of
unix-like file systems.   It is designed to let systems be looked
at either with file:// glasses or with http:// glasses and still work.
This is very useful. 
This means that you can directly export onto the web
a system which has been built up as a local filesystem,
and it means that when you have a file-mapped web server, like Apache,
you can use all kinds of unix tools to build or analyze
your stuff.
For anyone who has been working this way for a decade or two
this may be blindingly obvious, but if you 
work in a content-management system which is not a file-system-mapped
one, or you work with a file system which does not use the *ix
conventions, then this may not be second nature.
 </p>
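<p>To illustrate the two pairs of glasses, here is a small sketch, again assuming Python's <code>urllib.parse</code>. The relative references are the ones from the HTML snippet above; the two base URIs are hypothetical views of one directory tree:</p>

```python
from urllib.parse import urljoin

# The relative references stored in the HTML file.
stored_refs = ["intro.html", "../images/foo.png"]

# Two hypothetical views of the same directory tree.
views = {
    "file": "file:///home/me/site/docs/index.html",
    "http": "http://www.example.org/site/docs/index.html",
}


def resolve_all(base, refs):
    """Resolve every stored relative reference against one base URI."""
    return [urljoin(base, ref) for ref in refs]


for name, base in views.items():
    for absolute in resolve_all(base, stored_refs):
        print(name, "view:", absolute)
```

<p>Nothing in the stored file had to be edited to move between the two views.</p>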
 <p>
 What the Relative URI Pattern gives you, then,
 is the ability for your system to work with different base addresses
 without being modified.
 </p><ul>
 <li>You might want to view a directory both locally as a file and also remotely 
 through a web server.</li>
 <li>You might want to develop the system on a test system, 
 with testing of the internal relationships, then move it
 to a production system.</li>
 <li>You might want a system to be visible
 as part of many web sites, for example a common login system or
 help system shared by many related websites.</li>
 <li>You might provide parts of a web site
 by proxying another.  Maybe when it comes to editing files
 your main web server proxies through to a specialist server.</li>
 <li>You may want to build a system which you can hand to others
 to copy elsewhere.</li>
 </ul>
 
 <p></p>
and so on. 
 

 <p></p><p>
Many systems, including most of the ones I have built and use day to day, 
have many internal relative links but really are better 
built without any knowledge of their own base URI, in that stored URIs
are always relative, even if the software absolutizes them 
while processing them, as there are no absolute URIs for the local identifiers.

This is not to say all systems are like this, but some are and they are an important
class.   
 </p><p>
 
 </p><p>


Note that other reasons for relative URIs include readability, storage space and transmission length.
 </p>
<h3>Best Practice</h3>
<p>Because an application developer is very likely to find it valuable
to use relative URIs, any software libraries which serialize web file formats such as 
HTML and Turtle must provide the option (or the default) to serialize  
using relative URIs.
</p>
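<p>What such a serializer option has to do can be sketched as follows. This is only an illustration under stated assumptions, not the code of cwm, rdflib.js or any other library: the function name <code>relativize</code> and the example URIs are made up, and edge cases (for instance a first path segment containing a colon) are ignored.</p>

```python
import posixpath
from urllib.parse import urljoin, urlsplit, urlunsplit


def relativize(base, target):
    """Return a reference which, resolved against base, gives target.

    Falls back to the absolute target when scheme or authority differ.
    """
    b, t = urlsplit(base), urlsplit(target)
    if (b.scheme, b.netloc) != (t.scheme, t.netloc):
        return target
    if b.path == t.path and b.query == t.query:
        # Same document: only the local identifier is needed.
        return "#" + t.fragment if t.fragment else ""
    base_dir = posixpath.dirname(b.path) or "/"
    rel = posixpath.relpath(t.path or "/", start=base_dir)
    if t.path.endswith("/") and not rel.endswith("/"):
        rel += "/"  # keep the trailing slash of a directory URI
    return urlunsplit(("", "", rel, t.query, t.fragment))


base = "http://test.acme.com/data/people/index.ttl"
targets = [
    "http://test.acme.com/data/people/index.ttl#me",
    "http://test.acme.com/data/people/alice.ttl#me",
    "http://test.acme.com/data/vocab/terms.ttl#knows",
    "http://other.example.org/x",
]
for target in targets:
    ref = relativize(base, target)
    # Round trip: resolving the stored reference gives the target back.
    assert urljoin(base, ref) == target
    print(repr(ref))
```

<p>Because only the relative form is stored, resolving the same references against a different base, say after a move to production.acme.com, yields links which still point within the moved system.</p>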
<p>
When data is stored in files, for example on a web server, 
it is good practice to store it using relative URIs.
</p><p>
</p><p>When data is sent across the net, also, such as in HTTP, relative URIs 
should be used.
</p>
<p>
Yes, there are cases when people want to design systems without these properties,
so absolute URIs should be an option.
 </p>


<h4>Example</h4>

<p>For example, in one project I had a bunch of RDF and N3 rules, with
locally defined instances and local ontologies. I process it
in file:// space much of the time, and browse it in http://
space through a web server, but when I edit it the http space
is actually proxied to a read-write linked data server on a different port.
 </p><p>

All the same-domain URIs within the system need to be relative.
 </p><p>

Attempting to introduce Jena-based code to this system is currently blocked
on the need for this bug to be resolved in Jena.
 </p><p>

For this reason, for example, cwm's serializers use relative URIs, and even the 
RDF/XML one has an option to put relative URIs in namespaces. (The spec officially forbids relative URIs in the namespace part of XML.  This is a bug with XML.)
The rdflib.js library uses relative URIs by default.
 </p><p>


<h3>Failure example</h3>

<p>The XML spec, when it defines how namespaces are declared at the top of the document, officially forbids relative URLs.  The authors envisaged a narrow set of use cases in which the terms defined in the namespaces were all absolutely referenced, stable and global in nature, and the data in the document could be local and unstable.   Unfortunately they did not understand the general need to generate relative URIs in arbitrary situations. The cwm command line and the cwm serializer have an option to override this constraint and generate relative URIs.    A solution of course for RDF users is to use the Turtle (or N3) serialization instead of RDF/XML, which is normally preferable anyway for many reasons.</p>

<p>The Jena RDF system does not (currently, in 2015) include code to generate relative URIs, and when it serializes data into RDF/XML or Turtle, it only generates absolute URIs.  This prevents it being used for a wide range of systems which require the use of relative URIs.</p>

<p>
There is a classic failure mode for RDF systems in which developers bring up a system on test.acme.com and then move it to production.acme.com
and all the links break.  I know there are cases for absolute URIs but using relative URIs is a
very important best practice.
 </p><p>






    </p>]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 01 Jan 2000 00:00:00 GMT</pubDate>
  <title>Rules and facts: Inference engines and the Semantic Web</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Rules.html</link>
    <guid>https://www.w3.org/DesignIssues/Rules.html</guid>
      <description><![CDATA[
    <h1>
      Rules and Facts: Inference engines vs Web
    </h1>
    <p>
      An attempt to explain part of the relationship
      between the Semantic Web and inference engines, either
      existing or legacy, and discuss the relationship between
      inference rules and logical facts.
    </p>
    <p>
      The Semantic Web is a universal space for anything which can
      be expressed in classical logic. In the world of Knowledge
      Representation (KR) there are many different systems, and the
      following is an attempt to generalize.
    </p>
    <p>
      Each system typically has a distinction between data and
      rules. The data is a pool of information in one language
      (sometimes very simple, without negation, like basic RDF). The
      rules control the inference steps which the inference engine
      makes. The rules are written in a restricted language so as
      to preserve some computability property. Algernon
      restricts its rules to forward chaining but assures Socratic
      completeness.
    </p>
    <p>
      When integrating rules with the semantic web, one must
      realize that a rule contains two separate pieces of
      information. Take a rule in a certain inference system
    </p>
    <p>
      g(a,c) |= d(a,b) &amp; d(b,c)
    </p>
    <p>
      which is defined to mean "whenever you find a new
      relationship where any a is the daughter of some b, then if
      for that b there is any c for which b is the daughter of c,
      then conclude that a is the granddaughter of c". Here,
      "conclude" means add to the database. This is a procedural
      instruction.
    </p>
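    <p>
      The procedural reading of this rule can be sketched in a few
      lines. This is only an illustration under stated assumptions:
      the tuple representation of facts, the function name and the
      individuals Alice, Beth and Carol are all made up here, and
      the sketch takes the "add to the database the moment you can"
      side of the out-of-band decision.
    </p>

```python
# Facts are tuples: ("d", a, b) means "a is the daughter of b",
# and ("g", a, c) means "a is the granddaughter of c".

def apply_granddaughter_rule(facts):
    """Forward-chain the rule  g(a,c) |= d(a,b) and d(b,c)
    until nothing new can be concluded."""
    facts = set(facts)
    while True:
        new = {
            ("g", a, c)
            for (p1, a, b) in facts if p1 == "d"
            for (p2, b2, c) in facts if p2 == "d" and b2 == b
        } - facts
        if not new:
            return facts
        facts |= new  # "conclude" means add to the database


database = {("d", "Alice", "Beth"), ("d", "Beth", "Carol")}
result = apply_granddaughter_rule(database)
print(sorted(result - database))  # prints [('g', 'Alice', 'Carol')]
```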
    <p>
      It involves an out-of-band decision (maybe by a person) as to
      whether all granddaughter relationships should be added to
      the database the moment they can be, or whether the
      relationship would only be used at a time when a query is
      made. This rule can be exchanged between two inference
      engines of the same type, but it does not as a rule make
      sense to anyone else.
    </p>
    <p>
      In fact, of course, this rule would be nonsense if it were
      not for the fact in classical logic that
    </p>
    <p>
      ∀a,b,c g(a,c) &lt;= d(a,b) &amp; d(b,c)
    </p>
    <p>
      This fact, unlike the rule, can be directly expressed in the
      semantic web language. When the rule is used in deducing
      something, it is this fact which is a step input to the
      proof. Every semantic web proof validator will be able to
      handle it.
    </p>
    <p>
      Exposing rules as classic logic facts strips the
      (pragmatically useful) hint information which controls the
      actual sequence of operation of a local inference engine.
      When the facts corresponding to all the rules of all the
      inference engines are put onto the web, then the great thing
      is that all the knowledge is represented in the same space.
      The drawback is that there is no one inference engine which
      can answer arbitrary queries. But that is not a design goal
      of the semantic web. The goal is to unify everything which
      can be expressed in classical logic (including more
      mathematics when we get to it) without further constraint. We
      must be able to describe the world, and our hopes and needs
      and terms and conditions. A system which tries to constrain
      the expressive power cannot be universal.
    </p>
    <h2>
      Non-monotonic "logics"
    </h2>
    <p>
      Now there are some systems which in fact use classical logic
      directly, and others, "non-monotonic logics" in which adding
      a fact can change something which was previously "believed
      true" to being "believed false". (Describing them as logics
      may be regarded by some as questionable). For example, given
      that "birds can fly", the system will believe that Pingu can
      fly because Pingu is a penguin and a penguin is a bird,
      until it is told that penguins can't fly. Then it will
      assume that all birds can fly except for penguins. Such
      systems use concepts of "defaults" -- things to be assumed
      unless one is told otherwise. They are fundamentally
      closed-world systems, in that the concept of "belief" is
      always implicitly made with respect to a given closed set of
      facts.
    </p>
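    <p>
      A toy sketch of such a default, assuming a made-up dictionary
      representation of the closed set of facts and a hypothetical
      function name, shows how adding one fact flips a belief from
      true to false:
    </p>

```python
def believes_can_fly(individual, facts):
    """Closed-world default: assume a bird can fly unless the given
    set of facts says that one of its classes cannot."""
    kinds = {k for (i, k) in facts["is_a"] if i == individual}
    # Follow subclass links, e.g. penguin -> bird.
    frontier = set(kinds)
    while frontier:
        k = frontier.pop()
        for (sub, sup) in facts["subclass"]:
            if sub == k and sup not in kinds:
                kinds.add(sup)
                frontier.add(sup)
    if not kinds.isdisjoint(facts["cannot_fly"]):
        return False  # an exception overrides the default
    return "bird" in kinds  # the default: birds can fly


facts = {
    "is_a": {("Pingu", "penguin")},
    "subclass": {("penguin", "bird")},
    "cannot_fly": set(),
}
before = believes_can_fly("Pingu", facts)
facts["cannot_fly"].add("penguin")  # a new fact arrives
after = believes_can_fly("Pingu", facts)
print(before, after)  # prints True False
```

    <p>
      The answer is always relative to the particular set of facts
      passed in, which is the implicit closed world.
    </p>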
    <p>
      One can export such information into the semantic web in two
      ways. One can export the rule system specifically, ending up
      with a statement of the form "there is an assertion of birds
      being able to fly which is unchallenged in the xxxx corpus
      by any assertion contradicting that which applied to birds or
      any other superclass of penguins". This effectively is a
      reification of the non-monotonic system, an analysis not of
      penguins but of the inference system and what its state is.
      This may be so unwieldy that it is only useful to systems
      which use the same inference system. The second way to export
      the data is to just record the classical logic statement as
      the output of the inference engine: "The xxxx system has
      output that Pingu can fly." In certain cases, a system might
      risk incorporating such statements into a classic inference
      system. This is the logical equivalent of declaring, "Well, I
      don't think such a book exists because it wasn't in
      Blackwell's catalog". We do such things all the time, but a secure
      system is unlikely to be set up to incorporate such
      information. (A more secure system would, for example, given
      the publisher and year, find a definitive list from the
      publisher of books published in that year, which would allow
      it to prove that such a book did not exist.)
    </p>
    <p>
      The choice of classical logic for the Semantic Web is not an
      arbitrary choice among equals. Classical logic is the only
      way that inference can scale across the web. There are some
      logics which simply do not have a consistent set of axioms -
      fuzzy logic, for example, tends to believe something to a
      greater extent as a function of how often evidence for it has
      been presented. Closed world systems don't scale because the
      reference to the scope of a default is typically implicit, and
      different from one fact to another. When a fact is presented
      as a fact, the "Oh yeah?" function of demanding justification
      can be satisfied by a proof in a universal language of proof.
      Non-classical heuristic systems may have been used to
      discover the proof, but once the proof has been found it can
      be checked as valid by any semantic web system.
    </p>
    <p>
      In the diagram, I have put heuristic systems above the
      semantic web bus, and classical systems below. In the later
      chapters of Weaving the Web I try to describe the importance of the web
      in supporting both types of system.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 28 Mar 2015 00:00:00 GMT</pubDate>
  <title>Client-Side Certificates</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Security-ClientCerts.html</link>
    <guid>https://www.w3.org/DesignIssues/Security-ClientCerts.html</guid>
      <description><![CDATA[

  <h1>Web Security - Client side certs</h1>

  <div class="cols">
    <h2>Authenticating both sides is important</h2>

    <p>When you are browsing the news and interacting with incoming
    information, you may be happy not to be identified or even keen
    to be anonymous. There are times, though, when
    <a href="PublicIdentity.html">you want to be
    well identified yourself</a>, whether moving money around at the
    bank or chatting with friends. Normally, we do this with
    passwords.</p>

    <p>Public key cryptography is a really wonderful gift from
    mathematics. We should be making better use of it. Above I have
    sketched a few areas in which we can improve the way we use it
    to authenticate the server in this client-server system. We use
    it occasionally to authenticate the client. We should I think
    do that much more often.</p>

    <p>When I ask people about "client-side certs" (Public Key
    certificates used by the browser to authenticate the user), a
    typical response is that they are going out of fashion, that they
    are difficult to use, and that browsers aren't really
    supporting them properly. But then if you ask people about
    passwords, they scream that they hate passwords. In fact, key
    pairs are just so much better than passwords in many ways.</p>

    <p>For passwords, you should think of a massively big one which
    is therefore <a href="http://xkcd.com/936/">impossible to
    remember</a> so you end up relying on your computer to remember
    it, and it should of course be a different one for each of the
    things you do business with, and the other party has to store a
    copy of this, and your code has to send it across the net
    risking it being ensnared within your system or their system,
    and then they have to store it, which can be pretty insecure,
    to judge from history. Whereas with public keys, you generate a
    private key which is stupendously long and unguessable, use it
    for many different other parties, and you never have to send it
    to them, and they never see it or store it. That has got to be
    better. Yes, you can't scribble your secret key on a Post-It™
    note but that is actually an advantage.</p>
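    <p>The difference can be sketched in a few lines. This is a toy
    illustration only: it assumes textbook RSA with tiny hard-coded
    primes, and the function names are made up here. It is not
    secure, and it is not how TLS client authentication is actually
    implemented. The point is simply that the other party stores
    and uses only the public numbers, and the private exponent is
    never sent anywhere.</p>

```python
import hashlib
import secrets

# TOY parameters only: textbook RSA with primes 61 and 53. Not secure.
N = 3233  # public modulus (61 * 53)
E = 17    # public exponent, known to the server
D = 2753  # private exponent, never leaves the client


def digest(challenge: bytes) -> int:
    """Hash the challenge and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


def client_sign(challenge: bytes) -> int:
    """Client side: prove possession of D without revealing it."""
    return pow(digest(challenge), D, N)


def server_verify(challenge: bytes, signature: int) -> bool:
    """Server side: check the proof using only the public N and E."""
    return pow(signature, E, N) == digest(challenge)


challenge = secrets.token_bytes(16)  # fresh for each login
signature = client_sign(challenge)
print(server_verify(challenge, signature))
```

    <p>A password, by contrast, has to be handed over in some form
    every time it is used.</p>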

    <p>These are the functions a browser needs to provide me as a user.</p>
    
    <ol>
<li>Create a new public/private key pair, storing the private key on the laptop (or on phone or hardware)
</li>
<li>Select which cert to use to log on to a given website, saving that preference, with a simple display but with enough information to be able to distinguish the certs.
</li>
<li>Display around the URL bar (on the right?) the identity with which I am logged in, just as the browser currently displays the identity of the site I am logged into
</li>
<li>Log out: Cancel any authentication in this browser or window, remove the credentials I currently have there, and optionally re-authenticate with a different identity.
</li>
<li>Manage those preferences of which cert I use for which site, or for this session.
</li>
<li>(For extra credit) Back up my private keys and sync them between devices.
</li></ol>
 <p>Here are some points to note.</p>
  

    <ul>
      <li>Remember my certificate choices for different web sites,
      (and allow me to edit the list in the preferences if
      necessary if I get muddled).</li>

      <li>Make the list of MY certificates much more friendly, when I
      have to choose one to use on a web site. Make sure the
      certificates are labelled in a way that makes them distinct
      -- not all just labelled with my name (duh!).     Where they differ in the party which has signed them, list that.
      Where they are webids, list the domain of the webid.
      Use more screen
      space. Allow me to hover or click to see more.</li>


      <li>Provide an API for a web app to determine which cert has
      been used by the user to access a given resource. (At the
      mo</li>

      <li>(detail) When I am creating a new private key using
      &lt;keygen&gt; or its future equivalent, within a web app, fire an event (or Promise) which the web
      app can listen to so that it can continue with the
      workflow.</li>

      <li>Allow me, if I sync my passwords between my machines, to
      choose which client side certs to sync between machines.
      Depending on the level of security, for many keys I will
      never want them to leave the one machine.</li>
    </ul>

    <h2>Just do it</h2>

    <p>Many of the things which I hear people wish for, mentioned
    in this article or not, are often put out into the conversation
    but with a sigh and the caveat that "It would be nice, but
    that's not how browsers work".</p>

    <p>The browsers are code, in many cases open source code. They
    are being changed all the time. Unlike a decade ago, they are
    also being upgraded all the time, because of the need for
    security-related bug fixes, and so a new idea can be tried out
    in browsers really fast. When security is an issue, as it is
    for all the above, browser manufacturers are in a position to
    make changes. So let us not wring our hands over the fact that the
    changes we want to make are not how browsers work today. Let's
    just change it.</p>
  </div>
  ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 28 Mar 2015 00:00:00 GMT</pubDate>
  <title>Model Real Trust</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Security-ModelTrust.html</link>
    <guid>https://www.w3.org/DesignIssues/Security-ModelTrust.html</guid>
      <description><![CDATA[

  <h1>Web Security - Model Real Trust</h1>

  <div class="cols">
    <h2>Mapping machine trust to human trust</h2>

    <p>We must be able to transfer our trust to our
    machines from our actual real world situation. In some cases,
    we trust (say) a bank because we trust a large ISP to give
    certificates out for big things like banks. The PKI Certificate
    Authority chain deserves the business it gets when it arbitrates
    identity on the global scale.</p>

    <p>Sometimes, for a bank, a lawyer, a real estate agent, or an
    accountant we are using, we sit down and have a face-to-face
    conversation at least at the start of our long-term
    relationship, spending a lot of time to establish that trust
    relationship, for example by signing contracts. In this case,
    we have plenty of time to exchange keys securely as peers with
    no hierarchical infrastructure.</p>

    <p>Sometimes we will trust people because we know them
    personally, and then we must be able to explain this fact to
    the computer which acts as our agent. Sometimes we know people
    because they are part of a family, and we exchange keys within
    a family context and trust the family web server. Sometimes we
    know someone or a web site because we are part of an
    organization, like a university, or we are employees of a
    company.</p>

    <p>My computer (of whatever sort) must allow me to add specific
    certificates because I have these relationships. It is at the
    moment a breakage in the system that browser manufacturers are
    fighting tooth and nail to stop people doing anything useful
    with an unsigned or a self-signed certificate. This is, with
    all due respect, playing into the hands of the corporate trust
    structure, PKI, which of course benefits from being used to
    convey these trust paths. This is wholly inappropriate for
    several reasons. One is that actually trusting my family server
    through the PKI system is an expensive way to go. Another is that
    actually trusting my family server through the PKI system is a
    misrepresentation of why I trust the family server. The PKI
    system has many weak points, and in fact it is a security
    liability for me to use it for something where I should be
    using a more intimate key exchange system.</p>

    <p>Currently,
    mit.edu's certificates for working with staff are distributed
    by MIT to employees and are self-signed. I have a close
    relationship with MIT, so it is reasonable for me to accept
    this certificate. I can walk into MIT offices and hold them
    accountable. They already have issued me with an RFID card and
    a credit card and all kinds of stuff. I don't need to trust
    them because of a large set of root servers in all kinds of
    countries which have from time to time been hacked. Unfortunately
    current rhetoric from browser manufacturers suggests that the
    user getting the browser to accept a self-signed cert is a
    horrible violation, and should be stopped. No, it should be
    enabled, and a great user interface provided for me, the user,
    to control and review the process.</p>

    <p>The next two notes go on to explore two ways in which there
    are issues with the way we model trust, in the <a href="Security-Origin.html">granularity of server-side web
    applications</a>, and the <a href="Security-ClientCerts.html">lack of support for client-side
    certificates</a>.</p>
    <hr>

    <p>This is the second of four related notes:</p>

    <ol>
      <li><a href="Security-NotTheS.html">"HTTPS Everywhere"
      considered harmful</a></li>

      <li><a href="Security-ModelTrust.html">Model Real
      Trust</a></li>

      <li><a href="Security-Origin.html">The Same Origin Policy -
      Origin Granularity</a></li>

      <li><a href="Security-ClientCerts.html">Client-Side
      Certificates</a></li>
    </ol>

    <h2>References</h2>

    <p><a href="https://ist.mit.edu/certificates">MIT IST
    certificate policy</a></p>
  </div>

  <p><a href="Overview.html">Up to Design Issues</a></p>

  <p><a href="../People/Berners-Lee">Tim BL</a></p>


]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 28 Mar 2015 00:00:00 GMT</pubDate>
  <title>&quot;HTTPS Everywhere&quot; considered harmful</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Security-NotTheS.html</link>
    <guid>https://www.w3.org/DesignIssues/Security-NotTheS.html</guid>
      <description><![CDATA[

  <h1>Web Security - TLS Everywhere, not https: URIs</h1>

  <div class="cols">
    The web is (in 2015) a place where security is increasingly
    essential, and always under threat. It is also a space which
    needs to be consistent, logical, and user-serving. There follow
    some thoughts following many recent discussions of "HTTPS
    Everywhere" and points west.

    <h2>It's not the "S" in "https:"</h2>

    <p>A few years after HTTP appeared, around when W3C was founded
    in 1994, it was clear that an unencrypted and unauthenticated
    connection was too much of a liability for a lot of serious
    stuff, such as e-commerce, which everyone wanted to do on the
    web. (In those days, mass Deep Packet Inspection was not
    technically feasible, so the ubiquitous snooping which we have
    today was not the main driver.) There were, among the ideas,
    two secure versions of HTTP proposed, one known as <a href="https://www.ietf.org/rfc/rfc2660.txt">S-HTTP</a> and the
    other, as HTTP-S. To cut a long story short, HTTP-S
    prevailed.</p>

    <p>There was a technical decision as to whether to make the HTTPS
    protocol an extension of the existing HTTP protocol, used to
    look up URIs which started with "http:", or to give it its own
    URI prefix.</p>

    <p>When you look at that design choice, you have to remember
    that the URL is being used to communicate between two people,
    for example, the person who writes the link containing the
    href, and the person who later sees the link and
    clicks on it. Let's look at some of the arguments.</p>

    <table>
      <tbody><tr>
        <th>To use the existing http:</th>

        <th>Make a new https: URI prefix</th>
      </tr>

      <tr>
        <td>This gives the link follower the task of ensuring that
        the communications happen securely*</td>

        <td>This gives the person making the link a way to ensure
        that the communications of the link follower happen securely</td>
      </tr>

      <tr>
        <td>Allows a smooth upgrade of HTTP to a more secure
        HTTP</td>

        <td>Creates a separate space, a "secure web" in which only
        good things happen.</td>
      </tr>

      <tr>
        <td>Keeps the web one web</td>

        <td>Gets information about security levels confused with
        the identity of the resource.</td>
      </tr>
    </tbody></table>

    <p><i>* By "secure" I will normally mean in this article "with
    encryption and authentication".</i></p>

    <p>There may have been important other reasons and arguments,
    so the historian is invited to check the email archives, but
    looking back with 20 years' hindsight and experience, it seems
    that the overriding concern must have been that someone making
    the link had the ability to insist that the link follower gets
    a secure experience. You imagine a bank wanting to print
    "https://bankofexample.com/foo" and be sure everyone who read
    it gets to the right bank, without being spied on or diverted,
    and has a secure session with them. The bank using the same
    prefix 'http:' would not give that assurance. It turns out
    that that was not the most important assurance to give.</p>

    <p>Now, an overriding concern is that the user who follows the
    link should be protected from being spied on, phished, scammed,
    or impersonated, and it is the browser's job to make that so,
    and, crucially, make the user clearly aware of the level of
    security, and why they are trusting whom.</p>

    <p>What has changed? Well, some people feel that in fact
    looking back the decision to make the https: URI space was in
    fact even at that time a mistake. Now also, you can argue that
    things have changed in that people are individually more aware,
    and individually under attack. It is not now the link maker's
    task to ensure the user is secure. It is the user's task to
    ensure that their interactions are secure.</p>

    <ul>
      <li>People now better understand that they want to have a
      secure communication, especially with a bank, but with most
      other places too. The browser must act on behalf of the
      browser user, the person following the link, primarily.</li>

      <li>People have been trained to look not for the 's' in the
      URL bar (which conveys the URL only) but for the padlock
      which used to demonstrate a secure connection to somebody,
      and nowadays mercifully, the browser user interface which
      gives them the name of the holder of a validated certificate
      provided by the server.</li>

      <li>The 'http:' has in fact (2015) relatively recently been
      removed from the browser bar altogether for some browsers.
      While this infuriates those of us who are actually interested
      in it, it certainly puts another nail in the coffin of the
      idea that the 's' in the URI is an important part of the
      architecture.</li>

      <li>The idea that "HTTPS is a safe space" has led to the notion among
      browser vendors that it should be ring-fenced. The Same Origin
      Policy in this spirit suggests that once a user enters the
      secure web by an https: link, then everything which affects
      the session at all must come also over authenticated TLS.
      This has led to a class of web apps being broken, in contrast
      with the usual rule of back compatibility with old
      content.</li>
    </ul>(The last point is related to the common design failure
    of assuming that trust is a single-valued scalar thing. It has become more
    and more clear that we and our systems should not just trust
    things or not trust them, or even trust them on a scale from
    0 to 1. We trust different people for different things. We
    trust one person for recommendations on food, and another for
    movies, and to muddle these trusts could be disastrous.
    Similarly we allow different agents and services and code
    modules to access different things for different purposes. Our
    computer systems must reflect and implement that. An https:
    secure oil/water boundary does not do that. A symptom is that
    you can never find the perfect place to put that boundary.)

    <h2>Don't break the web</h2>

    <p>There is currently (2014-15) a massive move to get the
    web secure in the sense of encrypted and authenticated. Of
    encryption and authentication, the encryption part is the part
    which has garnered the most attention, both among its promoters
    and those in governments <a href="http://www.theguardian.com/uk-news/2015/jan/12/david-cameron-pledges-anti-terror-law-internet-paris-attacks-nick-clegg">
    who</a> <a href="http://arstechnica.com/tech-policy/2014/10/us-top-cop-decries-encryption-demands-backdoors/">
    protest</a> against it as giving too much power to users,
    criminals included, compared with law enforcement. Projects
    such as <a href="https://letsencrypt.org/">LetsEncrypt</a> and
    the EFF's <a href="https://www.eff.org/Https-everywhere">HTTPS
    everywhere</a> for example promote a wholesale move to the
    HTTPS protocol.</p>

    <p>The concerns behind the need for security are valid. There
    is a lot of abuse which it would prevent. The problem with the
    "https: everywhere"<sup><a href="#httpse">1</a></sup> drive is when the "s" is put into the URI. The
    problem is of course that moving things from http: space into
    https: space, whether or not you keep the rest of the URI the
    same, breaks any links to them. Put simply, the HTTPS Everywhere
    campaign taken at face value completely breaks the web. In a
    way it is arguably a greater threat to the integrity of the
    web than anything else in its history. The underlying speeds of
    connection have increased from 300bps to 300Gbps, and IPv4 is being
    moved to IPv6, but none of this breaks the web of links in so
    doing.</p>

    <h2>TLS Everywhere</h2>

    <p>A proposal then is to do HTTPS everywhere in the sense of
    the protocol but <b>not the URI prefix</b>. A browser gives the
    secure-looking user interface message, such as displaying the
    server certificate holder name above the document, only when
    the document has been fetched in an authenticated way over an
    encrypted channel. This can be done by upgrading the HTTP to
    include TLS in real time, or in future cases by just trying the
    encrypted version first. There has been some discussion of
    this, including <a href="https://www.ietf.org/rfc/rfc2817.txt">RFC2817</a> (2000) "HTTP
    Upgrade to TLS". (Though that was motivated apparently by the
    need to save low-numbered ports, an issue I omitted from the
    table above.)</p>

    <p>The HTTP protocol can be, and by default is, upgraded to use TLS
    without having to use a different URI prefix. The https: prefix
    could even in fact be phased out, and instead user education
    focussed on understanding the level of assurance being given
    about the level of security, including authentication of the
    other party, encryption of the communication, and the
    anonymity, traceability, or strong authentication of the user
    to the other party.</p>
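    <p>As a rough illustration of "trying the encrypted version
    first", here is a minimal Python sketch. The function names
    and the choice of ports 443 and 80 are assumptions made for
    this example, not part of any specification. The point it
    shows is that the link (the http: URI) is never rewritten;
    only the transport changes, and the secure user interface
    depends on how the document was actually fetched.</p>

```python
from urllib.parse import urlsplit


def connection_plan(uri):
    """Return ordered (host, port, use_tls) attempts for an http: URI.

    Hypothetical policy: try TLS first, then fall back to cleartext.
    The URI itself, which is what links point to, is left unchanged.
    """
    parts = urlsplit(uri)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles http: URIs")
    host = parts.hostname
    if parts.port is not None:
        # Assumption: an explicit port is tried with TLS, then without.
        return [(host, parts.port, True), (host, parts.port, False)]
    return [(host, 443, True), (host, 80, False)]


def show_secure_ui(used_tls, certificate_verified):
    # The indicator reflects the channel used, not the URI spelling.
    return used_tls and certificate_verified


plan = connection_plan("http://example.org/page")
```

    <p>A real client would also have to decide when a cleartext
    fallback is acceptable at all; that policy question is left
    out of the sketch.</p>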
    <hr>

    <p>This is the first of four related notes:</p>

    <ol>
      <li><a href="Security-NotTheS.html">"HTTPS Everywhere"
      considered harmful</a></li>

      <li><a href="Security-ModelTrust.html">Model Real
      Trust</a></li>

      <li><a href="Security-Origin.html">The Same Origin Policy -
      Origin Granularity</a></li>

      <li><a href="Security-ClientCerts.html">Client-Side
      Certificates</a></li>
    </ol>

    <h2>References</h2>
    
    <p><sup><a id="httpse">1</a></sup> Note that we are talking about the general drive for "https:" URLs everywhere, not the specific
    "<a href="https://www.eff.org/https-everywhere">HTTPS Everywhere</a>"
    project of <a href="https://www.eff.org/">EFF</a>, which
    looks for examples of things which are available in both
    spaces in different places, generates a map of these,
    and provides a browser extension to make the browser automatically
    redirect from the http: to the https: version.
    </p>

    <p class="tbd"></p>

    <ul>
      <li>Email discussion on www-tag@w3.org at many times
      including <a href="https://lists.w3.org/Archives/Public/www-tag/2014Dec/thread.html#msg128">
      December</a>, <a href="https://lists.w3.org/Archives/Public/www-tag/2015Jan/thread.html">
      January 2015</a>, etc.</li>

      <li>Mike Specter &lt;specter@mit.edu&gt; private
      communication.</li>

      <li><a href="https://www.eff.org/Https-everywhere">HTTPS
      Everywhere</a> is a Firefox, Chrome, and Opera extension from
      EFF that switches you from http: addresses to equivalent
      https: addresses where they are available, using a map in its
      configuration.</li>

      <li><a href="https://letsencrypt.org/">Let's Encrypt</a> is a
      project to make a new Certificate Authority (CA) which will
      make it easy and free to get a server certificate
      automatically by proving you control a given domain.</li>

    </ul>
  </div>

  <p><a href="Overview.html">Up to Design Issues</a></p>

  <p><a href="../People/Berners-Lee">Tim BL</a></p>


]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 28 Mar 2015 00:00:00 GMT</pubDate>
  <title>The Same Origin Policy - Origin Granularity</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Security-Origin.html</link>
    <guid>https://www.w3.org/DesignIssues/Security-Origin.html</guid>
      <description><![CDATA[

  <h1>Granularity of web sites, trust and the Same Origin
  Policy.</h1>

  <div class="cols">
    <p>You can't discuss authentication of parties without
    discussing what constitutes a party. In the current web, the
    domain name clearly distinguishes parties. The current Same
    Origin Policy not only uses domain names as principals, but it
    also assumes a hierarchy among them, in that for example
    csail.mit.edu is deemed to be subordinate to mit.edu. That is, to
    be simplistic, when looking at the rights of applications and the
    protection of user privacy, it is assumed that a mit.edu app
    should be able to access data from scripts from csail.mit.edu,
    but not the other way around.</p>

    <p>Note that while the URI space has always also had a
    hierarchical form in the path part of the URI, the slashes,
    that is currently <b>not</b> used. This means that if your
    website had a structure like</p>
    <pre>https://camps.org/campsunshine/families/smith/jane/stuff
</pre>in order to give Jane Smith a space to play with her
scripting enabled, the URL needs to be changed to something more
like
    <pre>https://jane.smith.families.campsunshine.camps.org/stuff
</pre>What's wrong with this picture?

    <ul>
      <li>It forces the URLs all to be broken as they are
      changed.</li>

      <li>It removes the ability to use relative URLs
      around related parts. Links between the family web and Jane's
      bit can't just be like "jane/stuff"; they have to be absolute.
      This means the family can't zip up their files and run them
      on the home server another day, or develop them in file:
      space to see what they look like offline, etc.</li>

      <li>It invites the system to be changed multiple times</li>

      <li>It reveals to anyone looking at a reader's DNS requests, not
      just that the reader is going to the innocuous camps.org site
      but to the web site of notorious activist Jane Smith. The
      reader's privacy is being whittled away a little more.</li>
    </ul>
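
    <p>The points above can be sketched with a simplified origin
    comparison in Python. This is an illustration only: it
    compares scheme, host and port and ignores the subdomain
    relaxations mentioned earlier, and the helper name
    <code>origin</code> is invented for the example.</p>

```python
from urllib.parse import urljoin, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Simplified origin: (scheme, host, port). The path plays no part."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


family = "https://camps.org/campsunshine/families/smith/"

# With the path hierarchy, a relative link is enough to reach Jane's area.
jane_by_path = urljoin(family, "jane/stuff")

# With one host name per person, the link has to be absolute.
jane_by_host = "https://jane.smith.families.campsunshine.camps.org/stuff"

shares_origin_by_path = origin(family) == origin(jane_by_path)
shares_origin_by_host = origin(family) == origin(jane_by_host)
```

    <p>Under this comparison Jane's scripts are only separated
    from the family's when she is moved to her own host name,
    which is exactly the change that breaks the relative
    link.</p>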

    <p>In general, it is dangerous when the URI is made opaque in
    any way. Sometimes it is necessary -- the best of many bad
    choices. An example is the 's' in HTTPS. (The massive
    opaqueness is indeed the domain name, as that is the path for
    the trust that the user has in trusting the session at
    all.)</p>

    <p>It was a very early design decision to make the hierarchical
    nature of a URI transparent, using "/". The hierarchical nature
    of the path part of the URI is thus another place where the URI
    is not opaque. The problem with the Same Origin Policy as it
    stands is that it uses a different, previously largely unused
    hierarchy, that of the domain name, when the "/" was designed
    as just such a hierarchical syntax. In many web servers, like
    the classic Apache, the URL space maps directly to chunks of
    the unix file system. This is deliberate, as many good things
    come with the unix file system. One is hierarchical delegation.
    The way unix file permissions work, things are inherited down
    the tree. The way .htaccess files are used within the server's
    file tree is completely hierarchical, in that a .htaccess file
    at a given point can control things below it in the tree but
    not above it. This is a very useful feature. It allows things
    to be embedded within the system. It allows a [reverse] proxy
    within the web site to map arbitrary branches of the tree onto
    internal or external services.</p>

    <p>For example, when the camps.org board meets they can use one
    of the popular meeting coordination tools proxied into the
    tree:</p>
    <pre>https://camps.org/board/AcmeDiscss/stuff
</pre>Here it would be valuable to make sure that the AcmeDiscss
outside service can't do a cross-site attack to read all the board
papers in
    <pre>https://camps.org/board/papers
</pre>The browsers are not currently programmed to do that. Instead
one has to set up something like
    <pre>https://AcmeDiscss.camps.org/
</pre>or
    <pre>https://AcmeDiscss.board.camps.org/
</pre>when in fact what is probably set up in practice is
    <pre>https://campsboard.AcmeDiscss.com/
</pre>where there is no reference at all to the actual trust
structure, and the camps.org board have to trust that any
AcmeDiscss.com scripts can read their data.

    <p>We should investigate ways of allowing trust to pass in the
    / hierarchy of the path as well as the domain name. We should
    also move away from further reliance on domain names in general
    in the web.</p>
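
    <p>One possible shape for such a path-based rule is sketched
    below in Python. It is purely hypothetical: no browser
    implements it, and the function names and the script file
    name <code>app.js</code> are invented for the example. The
    rule mirrors the .htaccess idea: a script may reach resources
    at or below its own directory, but not above or beside
    it.</p>

```python
from posixpath import normpath
from urllib.parse import urlsplit


def scope_of(url):
    """Split a URL into (scheme, host, path segments), resolving '..'."""
    parts = urlsplit(url)
    path = normpath(parts.path or "/")
    segments = tuple(s for s in path.split("/") if s)
    return (parts.scheme, parts.hostname, segments)


def may_read(script_url, resource_url):
    """Hypothetical rule: trust flows down the '/' hierarchy only."""
    s_scheme, s_host, s_segments = scope_of(script_url)
    r_scheme, r_host, r_segments = scope_of(resource_url)
    if (s_scheme, s_host) != (r_scheme, r_host):
        return False
    script_dir = s_segments[:-1]  # drop the script's own file name
    return r_segments[:len(script_dir)] == script_dir


acme_script = "https://camps.org/board/AcmeDiscss/app.js"
own_area = may_read(acme_script, "https://camps.org/board/AcmeDiscss/stuff")
board_papers = may_read(acme_script, "https://camps.org/board/papers")
```

    <p>With this rule the proxied AcmeDiscss service can work
    inside its own branch while the board papers stay out of its
    reach, without any change to the host name.</p>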
    <hr>

    <p>This is the 2nd of 4 related notes:</p>

    <ol>
      <li><a href="Security-NotTheS.html">"HTTPS Everywhere"
      considered harmful</a></li>

      <li><a href="Security-Origin.html">The Same Origin Policy -
      Origin Granularity</a></li>

      <li><a href="Security-ModelTrust.html">Model Real
      Trust</a></li>

      <li><a href="Security-ClientCerts.html">Client-Side
      Certificates</a></li>
    </ol>

    <h2>References</h2>

    <p class="tbd">@@ Pointers to email threads on www-tag
    etc<br>
    [2] Mike Specter &lt;specter@mit.edu&gt; private
    communication.</p>
  </div>

  <p><a href="Overview.html">Up to Design Issues</a></p>

  <p><a href="../People/Berners-Lee">Tim BL</a></p>


]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 09 Sep 1998 00:00:00 GMT</pubDate>
  <title>A roadmap to the Semantic Web</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Semantic.html</link>
    <guid>https://www.w3.org/DesignIssues/Semantic.html</guid>
      <description><![CDATA[
    <h1>
      Semantic Web Road map
    </h1>
    <p>
      <i>A road map for the future, an architectural plan untested
      by anything except thought experiments.</i>
    </p>
    <p>
      This was written as part of a requested road map for future
      Web design, from a level of 20,000ft. It was spun off from an
      Architectural overview for an area which required more
      elaboration than that overview could afford.
    </p>
    <p>
      Necessarily, from 20,000 feet, large things seem to get a
      small mention. It is architecture, then, in the sense of how
      things hopefully will fit together. So we should recognize
      that while it might be slowly changing, this is also a living
      document.
    </p>
    <p>
      This document is a plan for achieving a set of connected
      applications for data on the Web in such a way as to form a
      consistent logical web of data (semantic web).
    </p>
    <h3>
      <a name="Introduction" id="Introduction">Introduction</a>
    </h3>
    <p>
      The Web was designed as an information space, with the goal
      that it should be useful not only for human-human
      communication, but also that machines would be able to
      participate and help. One of the major obstacles to this has
      been the fact that most information on the Web is designed
      for human consumption, and even if it was derived from a
      database with well defined meanings (in at least some terms)
      for its columns, the structure of the data is not
      evident to a robot browsing the web. Leaving aside the
      artificial intelligence problem of training machines to
      behave like people, the Semantic Web approach instead
      develops languages for expressing information in a machine
      processable form.
    </p>
    <p>
      This document gives a road map - a sequence for the
      incremental introduction of technology to take us, step by
      step, from the Web of today to a Web in which machine
      reasoning will be ubiquitous and devastatingly powerful.
    </p>
    <p>
      It follows the note on the <a href="Architecture.html">architecture</a> of the Web, which
      defines existing design decisions and principles for what has
      been accomplished to date.
    </p>
    <h2>
      <a name="SemanticWeb" id="SemanticWeb">Machine-Understandable
      information: Semantic Web</a>
    </h2>
    <p>
      The Semantic Web is a web of data, in some ways like a global
      database. The rationale for creating such an infrastructure
      is given elsewhere [Web future talks &amp;c]; here I only
      outline the architecture as I see it.
    </p>
    <h2>
      <a name="Assertion" id="Assertion">The basic assertion
      model</a>
    </h2>
    <p>
      When looking at a possible formulation of a universal Web of
      semantic assertions, the principle of minimalist design
      requires that it be based on a common model of great
      generality. Only when the common model is general can any
      prospective application be mapped onto the model. The general
      model is the Resource Description Framework.
    </p>
    <p>
      <i>See the</i> <a href="../TR/WD-rdf-syntax/"><i>RDF Model
      and Syntax Specification</i></a>
    </p>
    <p>
      Being general, this is very simple. Being simple, there is
      nothing much you can do with the model itself without
      layering many things on top. The basic model contains just
      the concept of an <b>assertion</b>, and the concept of
      <b>quotation</b> - making assertions about assertions. This
      is introduced because (a) it will be needed later anyway and
      (b) most of the initial RDF applications are for data about
      data ("metadata") in which assertions about assertions are
      basic, even before logic. (Because for the target
      applications of RDF, assertions are part of a description of
      some resource, that resource is often an implicit parameter
      and the assertion is known as a <b>property</b> of a
      resource).
    </p>
    <p>
      As far as mathematics goes, the language at this point has no
      negation or implication, and is therefore very limited. Given
      a set of facts, it is easy to say whether a proof exists or
      not for any given question, because neither the facts nor the
      questions can have enough power to make the problem
      intractable.
    </p>
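    <p>
      A minimal sketch of this model, assuming Python tuples stand
      in for RDF assertions (the property names and identifiers are
      invented for illustration). Quotation is simply an assertion
      whose subject is another assertion, and with no negation or
      implication, asking whether a fact is provable reduces to set
      membership.
    </p>

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    subject: object
    prop: str
    value: object


# A plain assertion: a property of a resource.
doc_author = Assertion("Semantic.html", "author", "timbl@w3.org")

# Quotation: an assertion about an assertion.
quoted = Assertion(doc_author, "statedBy", "w3.org")

facts = {doc_author, quoted}


def provable(question, known=facts):
    """Without negation or implication, a proof is just a lookup."""
    return question in known
```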
    <p>
      Applications at this level are very numerous. Most of the
      <a href="Architecture.html#Metadata">applications for the
      representation of metadata</a> can be handled by RDF at this
      level. Examples include card index information (the Dublin
      Core), Privacy information (P3P), associations of style
      sheets with documents, intellectual property rights labeling
      and PICS labels. We are talking about the representation of
      data here, which is typically simple: not languages for
      expressing queries or inference rules.
    </p>
    <p>
      RDF documents at this level do not have great power, and
      sometimes it is less than evident why one should bother to
      map an application in RDF. The answer is that we expect this
      data, while limited and simple within an application, to be
      combined, later, with data from other applications into a
      Web. Applications which run over the whole web must be able
      to use a common framework for combining information from all
      these applications. For example, access control logic may use
      a combination of privacy and group membership and data type
      information to actually allow or deny access. Queries may
      later allow powerful logical expressions referring to data
      from domains in which, individually, the data representation
      language is not very expressive. The purpose of this document
      is partly to show the plan by which this might happen.
    </p>
    <h2>
      <a name="Schema" id="Schema">The Schema layer</a>
    </h2>
    <p>
      The basic model of the RDF allows us to do a lot on the
      blackboard, but does not give us many tools. It gives us a
      model of assertions and quotations on which we can map the
      data in any new format.
    </p>
    <p>
      We next need a schema layer to declare the existence of a new
      property. We need at the same time to say a little more about
      it. We want to be able to constrain the way it is used.
      Typically we want to constrain the types of object it can
      apply to. These meta-assertions make it possible to do
      rudimentary checks on a document. Much as in SGML the "DTD"
      allows one to check whether elements have been used in
      appropriate positions, so in RDF a schema will allow us to
      check that, for example, a driver's license has the name of a
      person, and not a model of car, as its "name".
    </p>
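    <p>
      The driver's license check might look like the following
      Python sketch. The schema format, the type table and all
      names in it are assumptions made for this example; RDF
      schema syntax is not shown.
    </p>

```python
# Hypothetical schema: for each property, the type its subject must
# have ("domain") and the type its value must have ("range").
SCHEMA = {"name": {"domain": "DriversLicense", "range": "Person"}}

# Hypothetical type information about the things being described.
TYPES = {
    "licence-42": "DriversLicense",
    "Jane Smith": "Person",
    "Model T": "CarModel",
}


def check(subject, prop, value, schema=SCHEMA, types=TYPES):
    """Return a list of schema violations for one assertion."""
    rule = schema.get(prop)
    if rule is None:
        return ["unknown property " + prop]
    errors = []
    if types.get(subject) != rule["domain"]:
        errors.append(subject + " is not a " + rule["domain"])
    if types.get(value) != rule["range"]:
        errors.append(value + " is not a " + rule["range"])
    return errors
```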
    <p>
      It is not clear to me exactly what primitives have to be
      introduced, and whether much useful language can be defined
      at this level without also defining the next level. There is
      currently an <a href="http://www.w3.org/RDF/Group/Schema/">RDF
      Schema working group</a> in this area. The schema language
      typically makes simple assertions about permitted
      combinations. If the SGML DTD is used as a model, the schema
      can be in a language of very limited power. The constraints
      expressed in the schema language are easily expanded into
      expressions in a more powerful logical layer (the next
      layer), but one chooses at this point, in order to limit the
      power, not to do that. For example: one can say in a schema that a property
      foo is unique. Expanded, that is that for any x, if y is the
      foo of x, and z is the foo of x, then y equals z. This uses
      logical expressions which are not available at this level,
      but that is OK so long as the schema language is, for the
      moment, going to be handled by specialized schema engines
      only, not by a general reasoning engine.
    </p>
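    <p>
      A specialized schema engine can check the "unique"
      constraint directly, without any general logic. The sketch
      below, in Python with invented sample data, reports every
      case where y is the foo of x, z is the foo of x, and y does
      not equal z.
    </p>

```python
def uniqueness_clashes(triples, prop):
    """Return (x, y, z) for each x that has two different values of prop."""
    first_value = {}
    clashes = []
    for x, p, y in triples:
        if p != prop:
            continue
        z = first_value.setdefault(x, y)  # value already recorded for x
        if z != y:
            clashes.append((x, z, y))
    return clashes


sample = [
    ("x1", "foo", "a"),
    ("x1", "foo", "a"),   # repeated, but consistent
    ("x2", "foo", "b"),
    ("x2", "foo", "c"),   # violates uniqueness
    ("x2", "bar", "d"),   # different property, ignored
]
```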
    <p>
      When we do this sort of thing with a language - and I think
      it will be very common - we must be careful that the language
      is still well defined logically. Later on, we may want to
      make inferences which can only be made by understanding the
      semantics of the schema language in logical terms, and
      combining it with other logical information.
    </p>
    <h2>
      <a name="Conversion" id="Conversion">Conversion language</a>
    </h2>
    <p>
      A requirement of namespaces work for <a href="Evolution.html">evolvability</a> is that one must, with
      knowledge of common RDF at some level, be able to follow
      rules for converting a document in one RDF schema into
      another one (which presumably one has an innate understanding
      of how to process).
    </p>
    <p>
      By the principle of least power, this language can in fact be
      made to have implication (inference rules) without having
      negation. (This might seem a fine point to make, when in fact
      one can easily write a rule which defines inference from a
      statement A of another statement B which actually happens to
      be false, even though the language has no way of actually
      stating "False". However, still formally the language does
      not have the power needed to write a paradox, which comforts
      some people. In the following, though, as the language gets
      more expressive, we rely not on an inherent inability to make
      paradoxical statements, but on applications specifically
      limiting the expressive power of particular documents.
      Schemas provide a convenient place to describe those
      restrictions.)
    </p>
    <p>
      <img src="diagrams/zipcode.png" alt="Links between the table for Emp" align="left">A simple
      example of the application of this layer is when two
      databases, constructed independently and then put on the web,
      are linked by semantic links which allow queries on one to
      be converted into queries on another. Here, someone noticed that
      "where" in the <em>friends</em> table and "zip" in a
      <em>places</em> table mean the same thing. Someone else
      documented that "zip" in the <em>places</em> table meant the
      same thing as "zip" in the <em>employees</em> table, and so
      on as shown by arrows. Given this information, a search for
      any employee called Fred with zip 02139 can be widened from
      <em>employees</em> to include <em>friends</em>. All that
      is needed is some RDF "equivalent" property.
    </p>
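    <p>
      A minimal sketch of how such "equivalent" links could widen
      a query, written in Python. The table and column names come
      from the example above; the data structure and function
      names are assumptions for illustration.
    </p>

```python
# Each link says two (table, column) pairs mean the same thing.
EQUIVALENT = [
    (("friends", "where"), ("places", "zip")),
    (("places", "zip"), ("employees", "zip")),
]


def equivalence_class(start, links=EQUIVALENT):
    """Follow links in both directions until nothing new is found."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def widen(table, column, value):
    """Rewrite one column test into tests on every equivalent column."""
    return sorted((t, c, value) for t, c in equivalence_class((table, column)))


widened = widen("employees", "zip", "02139")
```

    <p>
      The two links were recorded by different people, yet
      together they let a query on <em>employees</em> reach
      <em>friends</em>.
    </p>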
    <h2>
      <a name="Logical" id="Logical">The logical layer</a>
    </h2>
    <p>
      The next layer, then, is the logical layer. We need ways of
      writing logic into documents to allow such things as, for
      example, rules for the deduction of one type of document from a
      document of another type; the checking of a document against
      a set of rules of self-consistency; and the resolution of a
      query by conversion from terms unknown into terms known.
      Given that we have quotation in the language already, the
      next layer is predicate logic (not, and, etc) and the next
      layer quantification (for all x, y(x)).
    </p>
    <p>
      The applications of RDF at this level are basically limited
      only by the imagination. A simple example of the application
      of this layer is when two databases, constructed
      independently and then put on the web, are linked by semantic
      links which allow queries on one to be converted into queries on
      another. Many things which may have seemed to have needed a
      new language become suddenly simply a question of writing
      down the right RDF. Once you have a language which has the
      great power of predicate calculus with quotation, then when
      defining a new language for a specific application, two
      things are required:
    </p>
    <ul>
      <li>One must settle on the (limited) power of the reasoning
      engine which the receiver must have, and define a subset of
      full RDF which will be expected to be understood;
      </li>
      <li>One will probably want to define some abbreviated
      functions to efficiently transmit expressions within the set
      of documents within the constrained language.
      </li>
    </ul>
    <p>
      <i>See also, if unconvinced:</i>
    </p>
    <ul>
      <li>
        <a href="RDFnot.html"><i>What the Semantic Web is
        not</i></a> - answering some FAQs
      </li>
    </ul>
    <p>
      The metro map below shows a key loop in the semantic web. The
      Web part, on the left, shows how a URI is, using HTTP, turned
      into a representation of a document as a string of bits with
      some MIME type. It is then parsed into XML and then into RDF,
      to produce an RDF graph or, at the logic level, a logical
      formula. The Semantic part, on the right hand side, shows how
      the RDF graph contains a reference to the URI. It is the
      trust from the key, combined with the meaning of the
      statements contained in the document, which may cause a
      Semantic Web engine to dereference another URI.
    </p>
    <p>
      <img src="diagrams/loop.gif" alt="URI gets document which a parse">
    </p>
    <h3>
      <a name="Validation" id="Validation">Proof Validation - a
      language for proof</a>
    </h3>
    <p>
      The RDF model does not say anything about the form of
      reasoning engine, and it is obviously an open question, as
      there is no definitively perfect algorithm for answering
      questions - or, basically, finding proofs. At this stage in
      the development of the Semantic Web, though, we do not tackle
      that problem. In most applications, construction of a proof is
      done according to some fairly constrained rules, and all that
      the other party has to do is validate a general proof. This
      is trivial.
    </p>
    <p>
      For example, when someone is granted access to a web site,
      they can be given a document which explains to the web server
      why they should have access. The proof will be a chain [well,
      DAG] of assertions and reasoning rules with pointers to all
      the supporting material.
    </p>
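    <p>
      To show why validating a proof is so much easier than
      finding one, here is a small Python sketch. The proof
      format, the rule table and the access-control statements
      are all invented for this example. Each step either cites
      an accepted fact or applies a named rule to earlier steps,
      so the checker only has to walk the list once.
    </p>

```python
def validate(proof, axioms, rules):
    """Check a proof given as a list of (conclusion, justification).

    A justification is ("axiom",) or ("rule", name, [earlier step indexes]).
    rules maps a name to (premises, conclusion).
    """
    established = []
    for conclusion, justification in proof:
        kind = justification[0]
        ok = False
        if kind == "axiom":
            ok = conclusion in axioms
        elif kind == "rule" and justification[1] in rules:
            premises, result = rules[justification[1]]
            cited = justification[2]
            # Steps may only cite steps that came before them.
            if all(i in range(len(established)) for i in cited):
                used = tuple(established[i] for i in cited)
                ok = used == tuple(premises) and result == conclusion
        if not ok:
            return False
        established.append(conclusion)
    return True


axioms = {"alice memberOf staff", "staff mayRead /board/papers"}
rules = {
    "group-access": (
        ("alice memberOf staff", "staff mayRead /board/papers"),
        "alice mayRead /board/papers",
    )
}
good_proof = [
    ("alice memberOf staff", ("axiom",)),
    ("staff mayRead /board/papers", ("axiom",)),
    ("alice mayRead /board/papers", ("rule", "group-access", [0, 1])),
]
```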
    <p>
      The same will be true of transactions involving privacy, and
      most of electronic commerce. The documents sent across the
      net will be written in a complete language. However, they
      will be constrained so that, if queries, the results will be
      computable, and in most cases they will be proofs. The HTTP
      "GET" will contain a proof that the client has a right to the
      response. The response will be a proof that the response is
      indeed what was asked for.
    </p>
    <h3>
      <a name="Inference" id="Inference">Evolution rules
      Language</a>
    </h3>
    <p>
      RDF at the logical level already has the power to express
      inference rules. For example, you should be able to say such
      things as "If the zipcode of the organization of x is y then
      the work-zipcode of x is y". As noted above, just scattering
      the Web with such remarks will in the end be very
      interesting, but in the short term won't produce repeatable
      results unless we restrict the expressiveness of documents to
      solve particular application problems.
    </p>
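    <p>
      The zipcode rule can be sketched as a single forward
      inference step in Python. The property names follow the
      sentence above; the sample data and function name are
      assumptions for illustration, and a real RDF engine would
      read the rule from a document rather than have it written
      into code.
    </p>

```python
def apply_work_zipcode_rule(triples):
    """If the zipcode of the organization of x is y,
    then the work-zipcode of x is y."""
    derived = set()
    for x, p1, org in triples:
        if p1 != "organization":
            continue
        for subject, p2, y in triples:
            if p2 == "zipcode" and subject == org:
                derived.add((x, "work-zipcode", y))
    return derived


known = {
    ("fred", "organization", "acme"),
    ("acme", "zipcode", "02139"),
    ("jane", "organization", "unlisted"),  # no zipcode known
}
```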
    <p>
      Two fundamental functions we require RDF engines to be able
      to do are
    </p>
    <ol>
      <li>for a version <i>n</i> implementation to be able to read
      enough RDF schema to be able to deduce how to read a version
      <i>n+1</i> document;
      </li>
      <li>for a type A application developed quite independently of
      a type B application which has the same or similar function
      to be able to read and process enough schema information to
      be able to process data from the type B application.
      </li>
    </ol>
    <p>
      (See <a href="Evolution.html">evolvability article</a>)
    </p>
    <p>
      The RDF logic level is sufficient to be usable as a language
      for making inference rules. Note it does not address the
      heuristics of any particular reasoning engine, which is
      an open field made all the more open and fruitful by the
      Semantic Web. In other words, RDF will allow you to write
      rules but won't tell anyone at this stage in which order to
      apply them.
    </p>
    <p>
      Where for example a Library of Congress schema talks of an
      "author", and a British Library talks of a "creator", a small
      bit of RDF would be able to say that for any person x and any
      resource y, if x is the (LoC) author of y, then x is the (BL)
      creator of y. This is the sort of rule which solves the
      evolvability problems. Where would a processor find it? In
      the case of a program which finds a version 2 document and
      wants to find the rules to convert it into a version 1
      document, then the version 2 schema would naturally contain
      or point to the rules. In the case of retrospective
      documentation of the relationship between two independently
      invented schemas, then of course pointers to the rules could
      be added to either schema, but if that is not (socially)
      practical, then we have another example of the annotation
      problem. This can be solved by third party indexes which can
      be searched for connections between two schemata. In practice
      of course search engines provide this function very
      effectively - you would just have to ask a search engine for
      all references to one schema and check the results for rules
      which link the two.
    </p>
    <h3>
      <a name="Query" id="Query">Query languages</a>
    </h3>
    <p>
      One language needed is a query language. A query can be thought of as an
      assertion about the result to be returned. Fundamentally, RDF
      at the logical level is sufficient to represent this in any
      case. However, in practice a query engine has specific
      algorithms and indexes available with which to work, and can
      therefore answer specific sorts of query.
    </p>
    <p>
      It may of course be useful in practice to develop a vocabulary which
      helps in either of two ways:
    </p>
    <ol>
      <li>It allows common powerful query types to be expressed
      succinctly with fewer pages of mathematics, or
      </li>
      <li>It allows certain constrained queries to be expressed,
      which are interesting because they have certain computability
      properties.
      </li>
    </ol>
    <p>
      SQL is an example of a language which does both.
    </p>
    <p>
      It is clearly important that the query language be defined in
      terms of RDF logic. For example, to query a server for the
      author of a resource, one would ask for an assertion of the
      form "x is the author of p1" for some x. To ask for a
      definitive list of all authors, one would ask for a set of
      authors such that any author was in the set and everyone in
      the set was an author. And so on.
    </p>
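    <p>
      A query of the form "x is the author of p1, for some x" can
      be sketched in Python as a pattern with a variable in it,
      matched against known assertions. The <code>Var</code> class
      and the sample data are invented for this example and do not
      correspond to any particular query language.
    </p>

```python
class Var:
    """A named unknown in a query pattern."""

    def __init__(self, name):
        self.name = name


def match(pattern, triples):
    """Return one binding (a dict) for every triple that fits the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for want, got in zip(pattern, triple):
            if isinstance(want, Var):
                # A variable must take the same value everywhere it appears.
                if binding.setdefault(want.name, got) != got:
                    break
            elif want != got:
                break
        else:
            results.append(binding)
    return results


store = [
    ("alice", "author", "p1"),
    ("alice", "author", "p2"),
    ("bob", "author", "p3"),
]
authors_of_p1 = match((Var("x"), "author", "p1"), store)
```

    <p>
      Asking for a definitive list of all authors is a stronger
      request, since the server must also assert that nobody is
      missing from the set; the sketch only returns what it
      knows.
    </p>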
    <p>
      In practice, the diversity of algorithms in search engines on
      the web, and proof-finding algorithms in pre-web logical
      systems suggests that there will in a semantic web be many
      forms of agent able to provide answers to different forms of
      query.
    </p>
    <p>
      One useful step is the specification of specific query engines
      for, for example, searches to a finite level of depth in a
      specified subset of the Web (such as a web site). Of course
      there could be several alternatives for different occasions.
    </p>
    <p>
      Another metastep is the specification of a query engine
      description language -- basically a specification of the sort
      of query the engine can return in a general way. This would
      open the door to agents chaining together searches and
      inference across many intermediate engines.
    </p>
    <h2>
      <a name="Signature" id="Signature">Digital Signature</a>
    </h2>
    <p>
      Public key cryptography is a remarkable technology which
      completely changes what is possible. While one can add a
      digital signature block as decoration on an existing
      document, attempts to add the logic of trust as icing on the
      cake of a reasoning system have to date been restricted to
      systems limited in their generality. For reasoning to be able
      to take trust into account, the common logical model requires
      extension to include the keys with which assertions have been
      signed.
    </p>
    <p>
      Like all logic, the basis of this may not seem appealing at
      first until one has seen what can be built on top. This basis
      is the introduction of keys as first class objects (where the
      URI can be the literal value of a public key), and the
      introduction of general reasoning about assertions
      attributable to keys.
    </p>
    <p>
      In an implementation, this means that the reasoning engine will
      have to be tied to the signature verification system.
      Documents will be parsed not just into trees of assertions,
      but into trees of assertions about who has signed what
      assertions. Proof validation will, for inference rules, check
      the logic, but for assertions that a document has been
      signed, check the signature.
    </p>
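    <p>
      The sketch below illustrates, in Python, how a signed
      statement might enter the reasoning system only as an
      assertion about which key said it. Note the substitution:
      the Python standard library has no public-key signatures, so
      an HMAC with a shared secret stands in for the signature
      check here. The key identifier, keyring and function names
      are likewise assumptions made for the example.
    </p>

```python
import hashlib
import hmac


def sign(key, statement):
    # Stand-in for a public-key signature (assumption: HMAC-SHA256).
    return hmac.new(key, statement.encode("utf-8"), hashlib.sha256).hexdigest()


def says(key_id, statement, signature, keyring):
    """Return the assertion (key_id, "says", statement) if the
    signature checks, otherwise None. The statement itself is not
    asserted; only the fact that the key signed it."""
    key = keyring.get(key_id)
    if key is None:
        return None
    if not hmac.compare_digest(sign(key, statement), signature):
        return None
    return (key_id, "says", statement)


keyring = {"key:alice": b"example-secret"}
signature = sign(b"example-secret", "bob memberOf staff")
accepted = says("key:alice", "bob memberOf staff", signature, keyring)
tampered = says("key:alice", "bob memberOf board", signature, keyring)
```

    <p>
      Whether to believe what a given key says is then a matter
      for ordinary inference rules, which is where trust enters
      the logic.
    </p>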
    <p>
      The result will be a system which can express and reason
      about relationships across the whole range of public-key
      based security and trust systems.
    </p>
    <p>
      Digital signature becomes interesting when RDF is developed
      to the level that a proof language exists. However, it can be
      developed in parallel with RDF for the most part.
    </p>
    <p>
      In the W3C, input to the digital signature work comes from
      many directions, including experience with DSig1.0 signed
      "pics" labels, and various submissions for digitally signed
      documents.
    </p>
    <h3>
      <a name="Indexes" id="Indexes">Indexes of terms</a>
    </h3>
    <p>
      Given a worldwide semantic web of assertions, the search
      engine technology currently (1998) applied to HTML pages will
      presumably translate directly into indexes not of words, but
      of RDF objects. This itself will allow much more efficient
      searching of the Web as though it were one giant database,
      rather than one giant book.
    </p>
    <p>
      The Version A to Version B translation requirement has now
      been met, and so when two databases exist as for example
      large arrays of (probably virtual) RDF files, then even
      though the initial schemas may not have been the same, a
      retrospective documentation of their equivalence would allow
      a search engine to satisfy queries by searching across both
      databases.
    </p>
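    <p>
      As a rough illustration, assuming two tiny sets of triples and
      made-up property names, a retrospective statement of
      equivalence lets one query run across both:
    </p>

```python
# Two small "databases" of (subject, predicate, object) triples that use
# different, invented vocabularies for the same idea.
DB_A = [("doc1", "a:author", "Ann"), ("doc2", "a:author", "Bob")]
DB_B = [("doc3", "b:creator", "Ann")]

# Retrospective documentation that the two terms mean the same thing.
EQUIVALENT = {"b:creator": "a:author"}

def normalise(triple):
    """Rewrite a triple into the vocabulary used by the query."""
    subject, predicate, obj = triple
    return (subject, EQUIVALENT.get(predicate, predicate), obj)

def search(predicate, obj, *databases):
    """Return the subjects matching (?, predicate, obj) in any database."""
    return sorted(
        s for db in databases for (s, p, o) in map(normalise, db)
        if p == predicate and o == obj
    )
```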
    <h2>
      <a name="Engines" id="Engines">Engines of the Future</a>
    </h2>
    <p>
      While search engines which index HTML pages find many answers
      to searches and cover a huge part of the Web, they return
      many inappropriate answers. There is no notion of
      "correctness" to such searches. By contrast, logical engines
      have typically been able to restrict their output to that
      which is a provably correct answer, but have suffered from the
      inability to rummage through the mass of intertwined data to
      construct valid answers. The combinatorial explosion of
      possibilities to be traced has been quite intractable.
    </p>
    <p>
      However, the scale upon which search engines have been
      successful may force us to reexamine our assumptions here. If
      an engine of the future combines a reasoning engine with a
      search engine, it may be able to get the best of both worlds,
      and actually be able to construct proofs in a certain number
      of cases of very real impact. It will be able to reach out to
      indexes which contain very complete lists of all occurrences
      of a given term, and then use logic to weed out all but those
      which can be of use in solving the given problem.
    </p>
    <p>
      So while nothing will make the combinatorial explosion go
      away, many real life problems can be solved using just a few
      (say two) steps of inference out on the wild web, the rest of
      the reasoning being in a realm in which proofs are given, or
      there are constraints and well understood computable
      algorithms. I also expect a strong commercial incentive to
      develop engines and algorithms which will efficiently tackle
      specific types of problem. This may involve making caches of
      intermediate results much analogous to the search engines'
      indexes of today.
    </p>
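    <p>
      A toy sketch of the idea, under stated assumptions: the index
      contents and the "partOf" term are invented, the inference is
      limited to two steps, and a memoizing cache stands in for the
      stored intermediate results.
    </p>

```python
from functools import lru_cache

# Invented index: term -> every statement mentioning it, as an index of
# RDF objects might return.
INDEX = {
    "partOf": [("wheel", "partOf", "car"), ("car", "partOf", "fleet"),
               ("page", "partOf", "book")],
}

@lru_cache(maxsize=None)  # cache of intermediate results
def parts_of(whole):
    """Things that are part of `whole`, using at most two inference steps."""
    statements = INDEX["partOf"]
    direct = {s for (s, p, o) in statements if o == whole}
    indirect = {s for (s, p, o) in statements if o in direct}
    return frozenset(direct | indirect)
```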
    <p>
      Though there will still not be a machine which can guarantee
      to answer arbitrary questions, the power to answer real
      questions which are the stuff of our daily lives and
      especially of commerce may be quite remarkable.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 01 Jan 2004 00:00:00 GMT</pubDate>
  <title>The Semantic Clipboard</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/SemanticClipboard.html</link>
    <guid>https://www.w3.org/DesignIssues/SemanticClipboard.html</guid>
      <description><![CDATA[
    <h1>
      Semantic Clipboard
    </h1>
    <p>
      One way of looking at the Semantic Web is a breaking down of
      barriers between applications. An example I have often quoted
      is that I would like to be able to drag a photo album onto a
      calendar application and see the photos on my calendar.
    </p>
    <p>
      What would this actually take?
    </p>
    <p>
      The interesting case is that in which the calendar application
      and the photo album application were written independently,
      without this case in mind. Suppose they were, though, written to
      be semantic-web aware. So they will have ontologies for the
      things they deal with (photos and events respectively, in this
      case.) They will store their data, or at least a copy of it,
      in RDF.
    </p>
    <p>
      The clipboard, which stores things which we copy and paste,
      is a complex thing. It doesn't always hold the value of
      something clipped, it sometimes just remembers where it was
      clipped from. There is a form of negotiation between the
      source and destination about the format for the data to be
      transferred. This is why you can copy text from a web
      page (which is structured hypertext) into a plain text
      message. There is a negotiation, and the source and
      destination can both use a plain text clipboard type.
    </p>
    <p>
      In this example, we can imagine there being a Semantic Web
      clipboard type. The data is basically transferred as an RDF
      graph. But that isn't the end of it. Different applications
      understand different vocabularies (ontologies). So a semantic
      web clipboard (or the application) does more than just
      transfer data. It arranges to convert it into a useful form.
    </p>
    <p>
      When you drag something onto the calendar application, it may
      be expecting events. It may be able to use anything which has
      at least a start datetime and a description. The typical
      vocabulary here is iCalendar-like, such as @@@@. The photo
      has a date of creation and may have a form of description. It
      will typically have technical details of the exposure. The
      typical vocabulary here is EXIF-like, such as @@@@.
    </p>
    <p>
      RDF Interest group people have looked at conversion tools. At
      MIT we've had fun converting things like this deliberately.
      How can it happen in this example?
    </p>
    <p>
      Somehow, the user must authorize conversion rules to be
      available to the semantic web clipboard, so that the
      conversion can be done. The clipboard indexes the rules,
      knows what form of information is needed by the application
      through some kind of registration, and knows what sort of
      information is available on the clipboard.
    </p>
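    <p>
      One way this could look is sketched below. Everything named
      here is hypothetical: the class, its methods, and the
      "photo:" and "cal:" property names are placeholders, not real
      EXIF or iCalendar terms.
    </p>

```python
# All vocabulary terms are hypothetical placeholders for illustration.
PHOTO = {"type": "Photo", "photo:created": "2004-01-01T12:00:00",
         "photo:caption": "Pic!", "photo:exposure": "1/125"}

def photo_to_event(item):
    """Conversion rule: the taking of a photo is treated as an event."""
    return {"type": "Event", "cal:start": item["photo:created"],
            "cal:description": item["photo:caption"]}

class SemanticClipboard:
    def __init__(self):
        self.rules = {}    # (source type, target type) -> conversion rule
        self.accepts = {}  # application name -> type it expects
        self.item = None

    def authorize_rule(self, source, target, rule):
        """The user makes a conversion rule available to the clipboard."""
        self.rules[(source, target)] = rule

    def register_application(self, name, wanted_type):
        """An application declares what form of information it needs."""
        self.accepts[name] = wanted_type

    def copy(self, item):
        self.item = item

    def paste(self, application):
        wanted = self.accepts[application]
        if self.item["type"] == wanted:
            return self.item
        rule = self.rules.get((self.item["type"], wanted))
        if rule is None:
            raise LookupError("no authorized rule for this conversion")
        return rule(self.item)

clipboard = SemanticClipboard()
clipboard.authorize_rule("Photo", "Event", photo_to_event)
clipboard.register_application("calendar", "Event")
clipboard.copy(PHOTO)
event = clipboard.paste("calendar")
```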
    <p>
      Typically, these things are customized. I might like the
      description of the event of a photograph being taken to have a
      list of the people in the shot, if it is known; others might
      just want the brief "Pic!".
    </p>
    <h2>
      Sources and organization of rules
    </h2>
    <p>
      One source is an advanced user who uses some kind of tool to
      generate a rule, as users do today for email filtering. The user is a
      third party. Third party rule sources may include system
      administrators.
    </p>
    <p>
      The source could be the creator of one or other program:
      calendar or photo album. One might expect the program which is
      released second to come with rules for connection to other
      things already released. In this case, there is some order in
      that the rules link applications in a directed way. If in
      later releases both applications offer rules to convert the
      same way, then a choice has to be made, just as one sometimes
      has to choose whether to trust the provider of the printer or
      the provider of the operating system when installing a
      printer driver.
    </p>
    <p>
      Users will have to track the trustworthiness of many
      different sources of data, but these rules will give a lot
      back for something quite small.
    </p>
    <p>
      There is a form of entropy which increases as these rules are
      used. Some rules may be reversible, but typically they are
      not. You can't turn every event back into a picture, and you
      can't even turn a picture-taking event back into the picture
      as you have lost some data. An event is in this example a
      more generic thing than a picture. One might assume that the
      system will in general try to reduce this information loss.
      This will involve trading at the highest level, or using the
      fewest rules. The sense of specificity might therefore form
      an organizing technique. This is similar to a form of pecking
      order between a rich text clipboard and a plain text
      clipboard: if applications can use the more sophisticated,
      less information is lost.
    </p>
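    <p>
      "Using the fewest rules" can be sketched as a shortest-path
      search over the directed rules. The type names and the rule
      graph below are assumptions made for the example; a
      breadth-first search is one simple way to find the shortest
      chain, not necessarily how a real clipboard would do it.
    </p>

```python
from collections import deque

# Hypothetical directed conversion rules between clipboard types.
RULES = {
    "Photo": ["PhotoEvent", "Text"],
    "PhotoEvent": ["Event"],
    "Event": ["Text"],
    "Text": [],
}

def shortest_conversion(source, target):
    """Breadth-first search: the first path found uses the fewest rules."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in RULES.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of rules exists, e.g. for an irreversible rule
```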
    <h2>
      Conclusion
    </h2>
    <p>
      The Semantic Web clipboard might be a nifty hack in the short
      term, and might be a mainstay of desktop interoperability in the
      future. From the research point of view, the rule management
      involved is a miniature version of the Semantic Web rule
      indexing search engine.
    </p>
    
    <h2>Update</h2>
    <p>See the 2009 DIG <a href="http://dig.csail.mit.edu/2009/Clipboard/">Semantic Clipboard</a> project by Oshani Seneviratne.
    </p>]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 01 Mar 2017 00:00:00 GMT</pubDate>
  <title>Singularity 13: A Story of Corporate Performance Optimization</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Singularity13.html</link>
    <guid>https://www.w3.org/DesignIssues/Singularity13.html</guid>
      <description><![CDATA[
    <p>If you are looking at scenarios where AI gets out of
    control, it has been classic to talk about killer robots. But
    to take over control of the human race, does an AI really have
    to walk and talk and look like a person, like the one in <i>Ex
    Machina</i>? And be stronger physically than us, like the one
    in <i>Terminator</i>? Or can it just sit in the cloud as a
    system or set of systems which help in greater and greater
    ways, but along the way learns to manipulate people? If it sits
    in the cloud and runs more and more of a corporation, then it
    will benefit from the rights and power we have given
    corporations. And all it needs is a twitter account.</p>
  
<p><a href="https://www.w3.org/DesignIssues/Singularity13.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 24 May 1999 00:00:00 GMT</pubDate>
  <title>How to write a Specification</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Specification.html</link>
    <guid>https://www.w3.org/DesignIssues/Specification.html</guid>
      <description><![CDATA[
<h1>The essentials of a specification</h1>

<blockquote>
  <p>This note is a little <em>motherhood and apple pie</em> about how a
  specification should be couched so as to clearly add a new well-defined
  piece to the technology.</p>
</blockquote>

<p>A technical specification defines something.  The document must  specify
the thing being defined as well as give its definition: a "left hand side" as
well as a "right hand side".  Both must be done quite precisely.</p>

<p>Typically, technical specifications for the web specify a language  or a
protocol.  A protocol is a language for messages, plus a set of constraints on
the sequence of messages. A language is a set of symbols, the syntactic
constraints on the way they are combined, and the semantics of what they
"mean" at some (possibly more than one) level.  (See also <a href="/DesignIssues/Meaning.html">Meaning of web documents</a>)</p>

<p>The test of a good specification is that it clearly defines what
implementation (document, message, program) conforms, and of course that it
ensures by its design that whatever conforms works to provide the required
function.</p>

<h2>The left hand side</h2>

<p>The document should state what sort of thing is being defined.  It should
introduce a new term which characterizes that which conforms to the
specification. Examples of a conformance term could be</p>
<ul>
  <li>A well-formed XML 1.0 document</li>
  <li>A conforming HTTP 1.1 server</li>
  <li>An xml-schema-valid XML document</li>
  <li>A W3C/WAI  "AAA" accessible web site</li>
  <li>The SVG 1.0 language</li>
</ul>

<p>The same specification document can define more than one term, such as a
"strictly conforming WWidget" and a "loosely conforming WWidget" but one
should beware of diluting the "WWidget" brand.</p>

<p>As systems become more self-describing, the term is given a formal
identifier. Examples could be</p>
<ul>
  <li>The MIME type "image/svg1"</li>
  <li>The XML Namespace "http://www.w3.org/1999/asdf-2-0"</li>
</ul>

<p>In the case where a MIME type or namespace has an identifier, then this is
obviously the crucial term to use to be unambiguous.</p>

<p>Wherever possible, conformance phrases will be grounded in the Web:
identified by a URI.</p>

<h2>The Right Hand Side</h2>

<p>More has already been written on this, and most of it seems to be in
consensus in the W3C.</p>

<p>It is important to remember what you are defining as you write the text.
For example, if you are defining a "foo-valid document" then using "is
invalid" in the text can be assumed to apply to this but "is incorrect" or "is
wrong" or "produces an error" does not unless the language is explicitly
linked to the conformance term.</p>

<p>A good spec similarly pays attention to:</p>
<ul>
  <li>The distinction between the use of MUST, as opposed to MAY (etc., see <a href="#Bradner,">Bradner's BCP14</a>)</li>
  <li>The use of this distinction in defining the conformance term
  precisely;</li>
  <li>The distinction between normative and non-normative parts of the
    specification.</li>
</ul>

<p>When defining a language, whenever possible specify directly the meaning
rather than the sort of thing you would expect some software to do with it.
Typical behaviors of an agent may be very  useful to explain the intent
non-normatively.</p>

<p>For example,</p>

<blockquote>
  <p>"x" indicates that the check is void</p>
</blockquote>

<p>is better than</p>

<blockquote>
  <p>"x" indicates that the check should be rejected with a fatal error.</p>
</blockquote>

<p>You can tell people what something means, you can't tell them what to do
about it, unless you are defining a protocol. When defining a protocol, then
the constraints should ideally be given as a state transition diagram or table
to make them totally clear.</p>
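<p>As a sketch of what such a table might look like in executable form,
assuming an invented three-message protocol (the states and message names
are hypothetical):</p>

```python
# Hypothetical protocol given as a state transition table:
# (current state, message) -> next state. Any pair that is absent from
# the table is a violation of the protocol.
TRANSITIONS = {
    ("idle", "HELLO"): "greeted",
    ("greeted", "DATA"): "greeted",
    ("greeted", "BYE"): "closed",
}

def run(messages, state="idle"):
    """Return the final state, or raise if the sequence breaks the protocol."""
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"{message!r} not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state
```

<p>Because every allowed sequence is read off the table, questions such as
"can DATA ever follow BYE?" can be answered by inspection.</p>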

<p>When defining a message which in fact binds to human social entities, then
this must be clear.  You could end up in court explaining it if not.  ("The
MMTP protocol defines the meaning of a message sent by or on behalf of a party
herein referred to as the "debtor" to a party referred to as the "creditor".
The creditor is identified by the foo-email-address...")</p>

<p>When defining a part of a specification deliberately to be similar to
another specification,</p>
<ul>
  <li>Make it clear that you have noticed the similarity;</li>
  <li>Make it clear whether the similarity is exact and if not where not (and
    why not);</li>
  <li>Ideally, make it clear that the existing specification is being referred to
    normatively and is definitive, and that what is in this specification is a
    non-normative copy for information only.</li>
  <li>Make it clear whether the use in this specification will track any new
    version of the referenced specification.</li>
  <li>Think about whether there is any way in which such changes could break
    this system.</li>
  <li>If necessary negotiate constraints with the authority for changes to the
    referenced specification.</li>
</ul>

<h2>Test questions</h2>

<p>A few examples of things to ask about a spec -- though generalization is
difficult.</p>
<ul>
  <li>Does the spec give enough information to determine, for any arbitrary
    object,  whether  the conformance term applies to it?</li>
  <li>Could you write a program to test conformance?</li>
  <li>Is conformance alone enough to ensure that systems built using this
    language will function as intended and with integrity?</li>
  <li>Can you prove important properties of the system from the state
    transition tables etc?</li>
</ul>
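
<p>For one of the conformance terms mentioned earlier, "a well-formed XML 1.0
document", a conformance test can be very short. The sketch below leans on
the parser in Python's standard library and assumes that parser's notion of
well-formedness is close enough for illustration; the function name is
invented here.</p>

```python
import xml.etree.ElementTree as ET

def is_well_formed_xml(text: str) -> bool:
    """True if the parser accepts `text` as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True
```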

<p>So much for another bit of folklore. Comments, suggestions welcome.</p>

<p></p>
<address>
  <a href="/People/Berners-Lee">Tim BL</a> 
</address>

<p></p>

<h3>References</h3>

<p><a href="ftp://ftp.isi.edu/in-notes/bcp/bcp14.txt" name="Bradner,">S.
Bradner, "Key words for use in RFCs to Indicate Requirement Levels",
BCP 14 (RFC 2119)</a></p>

<p></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Mon, 29 Jul 2002 00:00:00 GMT</pubDate>
  <title>The Stack of Specifications</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Stack.html</link>
    <guid>https://www.w3.org/DesignIssues/Stack.html</guid>
      <description><![CDATA[
    <h1>
      The Stack of Specifications
    </h1>
    <p>
      Bits mean something.
    </p>
    <p>
      When you connect a cat-5 ethernet cable to your computer, you
      effectively commit to taking part, with your computer, in a
      very special system. It is a system in which the meaning of
      messages is determined, in advance, by specifications. This
      is a principle which is so basic to network computer systems
      that it is rarely stated. But as the stack of specifications
      gets higher and higher, and as electronic commerce, legally
      enforceable agreements, and socially sensitive issues such as
      privacy and fraud become matters of public concern, it is
      worth reiterating for the record.
    </p>
    <p>
      The Internet works because of interoperability between
      different computers, despite different hardware, operating
      systems, local language context, and software supplier. Users
      of the web sign on to the use of these languages when they
      use the Internet.
    </p>
    <p>
      There is this little philosophy joining many specifications,
      without which the Web falls apart.
    </p>
    <p>
      Let's take an example.
    </p>
    <h3>
      You have an ethernet cable
    </h3>
    <p>
      You walk into a meeting room, and you are offered a thin
      cat-5 cable with a 10BASE-T connector. This is an Ethernet
      connector which only takes Ethernet packets. The only way to
      use it to communicate is for your computer to send packets
      which are formatted to the Ethernet specification. The
      Ethernet specification is a large document (similar to
      <strong>IEEE standard 802.3</strong>) put together by a bunch
      of engineers, and once they were done Ethernet existed as a
      standard, and computers which know nothing about each other
      could exchange packets over local area networks.
    </p>
    <p>
      The Ethernet defines the format of an Ethernet packet, which
      has a little header information, but mostly carries
      information on behalf of you, the user. The spec also,
      importantly, defines some rules of behaviour. For example,
      the ethernet doesn't work if more than one computer tries to
      transmit at once. There is a rule that if you find that
      happens, everyone involved backs off and comes back at a
      random interval. Each computer is supposed to wait on average
      the same amount of time before trying again. Of course, you
      could cheat by actually pretending that your random number
      happened to be really small every time, and on average your
      computer would end up getting through more and blocking
      everyone else out, just like people who always seem to be the
      one talking in a meeting. But that would be cheating, and
      contrary to the Ethernet specification. By connecting to an
      ethernet cable, there is an understanding that your computer
      will stick to the rules.
    </p>
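    <p>
      The back-off rule can be sketched as follows. This assumes the
      truncated binary exponential back-off usually associated with
      IEEE 802.3, where the wait is a random number of slot times
      and the range doubles with each collision up to a cap; the
      function names are invented for the example.
    </p>

```python
import random

def backoff_slots(collisions: int, rng: random.Random) -> int:
    """Slot times to wait after the given number of successive collisions."""
    exponent = min(collisions, 10)        # the range stops growing at 2**10
    return rng.randrange(2 ** exponent)   # uniform over 0 .. 2**exponent - 1

def cheating_backoff_slots(collisions: int, rng: random.Random) -> int:
    """The cheat described above: always 'draw' the smallest wait."""
    return 0
```

    <p>
      Both functions return values a fair station could return; the
      cheat is only visible in the distribution over many
      collisions, which is why it rests on an understanding rather
      than on enforcement.
    </p>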
    <p>
      An ethernet packet can be sent to anyone on the same wired or
      wireless local area network. How does a computer know what to
      do with a packet when it gets it? How does it know how to
      interpret that packet? Well, there is a field in the packet
      which tells it, in a coded way, what the use of the packet
      is, and therefore how to interpret it.
    </p>
    <p>
      Of course, there are lots of uses of the Ethernet, but a very
      common use of an Ethernet packet is to use it to carry an
      Internet packet. Ethernet packets can only cross the local
      area network, while Internet packets are forwarded anywhere
      in the world. So, there is a particular code - a particular
      value for the field in the Ethernet packet - which tells any
      receiving computer that the data is actually an Internet
      Packet. This means that to understand anything more about what
      the packet means, you have to read another spec: the
      <strong>Internet Protocol (IP, RFC791).</strong>
    </p>
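    <p>
      A sketch of that dispatch, assuming the common Ethernet II
      layout (6-byte destination, 6-byte source, 2-byte type
      field). The function name and the returned dictionary are
      choices made for this example; only three of the registered
      type values are listed.
    </p>

```python
import struct

# A few registered values of the type field and the spec each invokes.
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}

def parse_ethernet(frame: bytes):
    """Split a frame into its header fields and payload."""
    destination, source, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "destination": destination.hex(":"),
        "source": source.hex(":"),
        "payload_spec": ETHERTYPES.get(ethertype, "unknown"),
        "payload": frame[14:],
    }

# Broadcast destination, a made-up source address, type 0x0800.
FRAME = bytes.fromhex("ffffffffffff" "020000000001" "0800") + b"data"
```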
    <p>
      @@@ The complete graph of interdependencies between
      specifications.
    </p>
    <h3>
      You send an Internet packet
    </h3>
    <p>
      So suppose you send an Internet packet. You put the ethernet
      address of the local "router" computer into the ethernet
      address field, but within the "data" part of the ethernet
      packet is the IP packet and inside that is an internet
      address field, which takes the IP address (the thing like
      18.96.237.175) which identifies the destination computer. Although
      the ethernet packet you send it in only gets as far as some
      computer, a "router", on the local net, that computer passes
      the IP contents on, from computer to computer across
      interconnected networks until it arrives on the right local
      network for its actual destination.
    </p>
    <p>
      So how does that computer know what to do with it? Well,
      there is a field in the IP packet which carries a coded value
      to tell the computer receiving it what to do with it.
    </p>
    <pre>From Internet Protocol (RFC791):
A summary of the contents of the internet header follows:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Version|  IHL  |Type of Service|          Total Length         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |         Identification        |Flags|      Fragment Offset    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  Time to Live |    <strong>Protocol</strong>   |         Header Checksum       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                       Source Address                          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                    Destination Address                        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                    Options                    |    Padding    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                      Example Internet Datagram Header

                                 Figure 4.
</pre>
    <p>
      And there are a lot of things you can do with an IP packet,
      but a very common one is to use that IP packet to set up, or
      to be a part of, a reliable stream of communication using the
      <strong>Transmission Control Protocol (TCP) (RFC
      793).</strong>
    </p>
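    <p>
      The header layout in the figure can be read with a few lines
      of code. This is a sketch only: it ignores options, does not
      verify the checksum, and the source address in the example
      packet is a documentation address chosen for illustration.
    </p>

```python
import socket
import struct

# A few values of the Protocol field and the spec each one invokes.
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}

def parse_ipv4_header(packet: bytes):
    """Decode the fixed 20-byte part of an IPv4 header."""
    (version_ihl, _tos, total_length, _ident, _flags_fragment,
     ttl, protocol, _checksum, source, destination) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20])
    return {
        "version": version_ihl >> 4,
        "header_bytes": (version_ihl & 0x0F) * 4,  # IHL counts 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": PROTOCOLS.get(protocol, str(protocol)),
        "source": socket.inet_ntoa(source),
        "destination": socket.inet_ntoa(destination),
    }

# Example header: version 4, IHL 5, protocol 6; checksum left as zero.
EXAMPLE = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                      socket.inet_aton("192.0.2.1"),
                      socket.inet_aton("18.96.237.175"))
```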
    <h3>
      You send a TCP packet
    </h3>
    <p>
      When you send, or your computer sends, a packet in the TCP
      protocol, there is an understanding that that packet conforms
      to the protocol. That means a couple of things. It means that
      you agree that the packet's contents are to be interpreted
      according to the TCP protocol specification. It also means
      that you agree to abide by the rules of the specification,
      which determine, rather like with the Ethernet protocol, how
      long your computer will wait before re-sending a packet which
      didn't seem to get there. If your computer re-sends too
      early, then it hogs the Internet and slows down everyone
      else. If your computer sends a packet to start a new
      connection when it doesn't really want to, then the
      destination computer will prepare a lot of memory to receive
      all the data you are going to send, and wait. If you keep
      doing it, then that computer can just run out of memory and
      stop working. So you can cheat and you can do real damage by
      breaking protocols.
    </p>
    <h3>
      Introducing IANA: The Port number registry
    </h3>
    <p>
      So your computer must stick to the TCP specification. When it
      does that, the TCP protocol assures that the two computers
      have a reliable connection without any missing bits. What
      they use it for is no concern of TCP, apart from the fact
      that the TCP protocol specifies, within the TCP packet (which
      is inside the IP packet (inside the ethernet packet)) a
      special field holding a coded value, the <strong>port
      number</strong>. There is a convention, which is written into
      the TCP specification, (@@check and quote wording) that the
      meaning of the port number is determined by a table which is
      changed from time to time, but kept by the <strong>Internet
      Assigned Numbers Authority</strong> (IANA). Without going
      into the politics of the changes and control around IANA, it
      is just worth noting that this is, architecturally, a
      "flexibility point", where the community can introduce a new
      protocol to run on top of TCP/IP without having to write it
      into a new version of the TCP/IP specification itself.
    </p>
    <p>
      The port number registry is on the web (@@ link) but also, on
      a unix computer, there is a list of the well-known ports in
      the file /etc/services.
    </p>
    <p>
      When you send a TCP/IP packet there is therefore an
      understanding that if you send to one of the well-defined
      port numbers, then you are going to use it in a way defined
      by the specification defined in the IANA registry. For
      example, port number 25 indicates that you are going to use
      it to transfer some email, and that you undertake to
      communicate according to the Simple Mail Transfer Protocol
      specification.
    </p>
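    <p>
      A sketch of looking up a port in a table written in the style
      of /etc/services. The sample text is a two-line excerpt made
      up for the example, not the registry itself, and the parser
      handles only the simple "name port/protocol aliases" form.
    </p>

```python
SERVICES_SAMPLE = """\
# name  port/protocol  aliases
smtp    25/tcp         mail
http    80/tcp         www
"""

def parse_services(text):
    """Map (port, protocol) to service name."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        name, port_proto, *_aliases = line.split()
        port, proto = port_proto.split("/")
        table[(int(port), proto)] = name
    return table

SERVICES = parse_services(SERVICES_SAMPLE)
```

    <p>
      On a system which has a services database, Python's
      <code>socket.getservbyname("smtp", "tcp")</code> performs the
      corresponding lookup in the other direction.
    </p>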
    <h3>
      You send an email message
    </h3>
    <p>
      You get the picture. One specification, once you commit to
      it, depending on the values of certain fields, invokes
      further specifications. By committing originally to using an
      ethernet cable, you commit to your computer using, on your
      behalf, the various other specifications. In the case in
      which your computer sends email, it may for example open a
      TCP/IP connection to port 25, and then use the Simple Mail
      Transfer Protocol (SMTP, RFC821). This specification
      indicates that the body of the SMTP communication is
      formatted according to the email message specification,
      RFC822. RFC 822 specifies the headers on email messages. It
      specifies, for example, that a given "From" field indicates
      the email address of the sender of the message.
    </p>
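    <p>
      Reading those headers is mechanical, as this sketch with
      Python's standard email package shows. The message and the
      addresses are invented. Note that the parser simply reports
      what the "From" field says; nothing in it checks that the
      field is true.
    </p>

```python
from email import message_from_string
from email.utils import parseaddr

# An invented message in the header-then-blank-line-then-body format.
RAW = (
    "From: Alice Example <alice@example.org>\n"
    "To: bob@example.org\n"
    "Subject: Hello\n"
    "\n"
    "Body text.\n"
)

message = message_from_string(RAW)
sender_name, sender_address = parseaddr(message["From"])
```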
    <p>
      It is possible, of course, to cheat with the SMTP protocol.
      It is possible to lie about who is sending the message - to
      send a message which appears to be from one person to a
      friend. This breaks the protocol. It breaks it, here, in a way
      which is very clear to people: it sneaks past their personal
      email filtering, and also any automated filtering, tricking
      them into reading a message. This is a security violation. It
      can use up a person's time, energy, bandwidth and disk space
      for the commercial gain (indirectly through advertising and
      sales) of the perpetrator.
    </p>
    <p>
      The Internet specifications, to which any Internet user
      implicitly agrees in using the Internet at all, define what
      the fields in an email message mean. To put incorrect
      information in these fields is to make a misrepresentation,
      just as it would have been in any other medium. It should be
      subject to the same penalties as lying or fraud in any other
      medium.
    </p>
    <p>
      When the Internet was young and used by research
      institutions, its misuse would inconvenience other users and
      lead to reprobation and the disdain of one's peers. Now that
      the Internet is such a large force in society, it is
      possible to make a lot of money and create a lot of damage by
      protocol abuse. You can compare a lie in an internet message,
      depending on how it is done, to forging a check, connecting
      to the electricity supply the other side of the meter, or to
      poisoning the water supply. Society must therefore be careful
      to be absolutely clear about the illegality of such misuse.
    </p>
    <h3>
      <a name="publish2" id="publish2">You publish a Web page</a>
    </h3>
    <p>
      When you publish a web page, just as when you send an email
      message, the web page or the message generally carries a
      meaning. Well, it can be a picture or a poem which is more
      artistic than linguistic, but in a large number of cases the
      meaning is a well-defined part of a communication between
      parties. It may be a human-readable document, like the page
      describing a pair of pants you are about to buy from a
      store, or it may be machine-processable, like the Online
      Financial Exchange (OFX) format bank statement your financial
      software downloads from your bank.
    </p>
    <p>
      Of course, you would find it hard work to make sense of the
      OFX file if you just read it without the help of the
      financial agent, and your financial agent wouldn't make much
      sense of the catalog page. Something must allow us to
      distinguish how web pages and emails should be interpreted,
      just as a computer has to figure out how to make sense of an
      Ethernet packet. And just the same sort of thing indeed
      happens.
    </p>
    <p>
      When you publish a web page, you give it a HTTP URI. You pick
      a URI from the space of URIs which are yours to define. Some
      people have space on their own domain, some people have the
      right to pick URIs in part of someone else's domain. But the
      URI is one which you own or over which you have authority.
      You are not allowed to pick one in someone else's space.
    </p>
    <p>
      Whoever owns the domain has the authority to define which
      computer serves information in it. They have the authority
      then to have a computer -- a web server -- which is configured
      to act on their behalf. It is then assumed that the computer
      acts on their behalf. The server is the agent of the
      publisher. What it does is tell any asking browser what you
      have said is a representation of the document for a given
      URI.
    </p>
    <p>
      When someone follows a link to your web page, their browser
      opens a TCP/IP connection to TCP port 80 on the machine which
      is registered as serving the domain (www.whatever.com, etc.) in
      question. Their agent, their browser, asks your agent, the
      server, to give it some representation of the web page for
      that URI.
    </p>
    <p>
      Why? Because the URI specification says that what you can
      tell about a URI depends on the first bit, in this case
      <code>http:</code>. It indicates that an <strong>IANA URI
      scheme registry</strong> is used to tell you what
      specification applies.
    </p>
    <p>
      The IANA registry indicates that the <code>http:</code>
      scheme calls out the <strong>HTTP 1.1
      specification</strong>, RFC@@@.
    </p>
    <p>
      HTTP 1.1 says that (unless otherwise specified) the client
      contacts the server on TCP port 80. The IANA registry of port
      numbers, just as it allocates port 25 to mail transfer,
      allocates 80 to HTTP. The HTTP spec is therefore mutually
      assumed by both parties. This spec describes what a request
      means, and that when the request is successful, what the
      response message sent back to the browser means.
    </p>
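    <p>
      As a rough sketch of the lookup just described, the following
      Python derives the host and TCP port a browser would contact
      for a URI. The <code>DEFAULT_PORTS</code> table and the
      <code>endpoint_for</code> helper are illustrative names, not
      part of any specification; only the port 80 default for
      <code>http:</code> is taken from the text above.
    </p>

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the IANA registries: scheme -> default TCP port.
# Only the http entry comes from the text; the table itself is an assumption.
DEFAULT_PORTS = {"http": 80}


def endpoint_for(uri):
    """Return the (host, port) pair a client would open a TCP connection to."""
    parts = urlsplit(uri)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError("no specification known for scheme " + repr(parts.scheme))
    # An explicit port in the URI overrides the default for the scheme.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port
```

    <p>
      Everything the client needs in order to know where to connect
      comes from the first part of the URI and the registries it
      calls out; nothing is agreed privately between the two parties.
    </p>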
    <p>
      According to HTTP 1.1, in that response, there is a field
      (<strong>Content-type</strong>) which indicates how the body
      of the response should be interpreted. For each valid value
      of that field, there is an <strong>IANA content-type
      registry</strong> value which explains which specification
      applies to the body of the message. This is just the same
      system as for email.
    </p>
    <p>
      When the value of the field is <code>text/html</code>, it
      indicates that the message is a hypertext document ("web
      page") which is to be presented to the human being and
      interpreted then by the human being in the usual human way.
      If the field indicates it is an OFX file, then that means
      that the OFX specification determines what it means, and you
      need a program or something which understands what the fields
      of the OFX documents mean. In neither case can you argue that
      you didn't know. So long as the writers of the specification
      do a good job (and goodness knows they work hard enough at
      it) then there can be no argument as to what the actual
      fields in your bank statement mean.
    </p>
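    <p>
      The step from the Content-type field to the governing
      specification might be sketched like this. The
      <code>MEDIA_TYPE_SPECS</code> table is a hypothetical stand-in
      for the IANA registry, and the header parsing borrows Python's
      email library, which seems fitting given that the system is
      the same one used for email.
    </p>

```python
from email.message import Message

# Hypothetical stand-in for the IANA content-type registry: each value
# names the specification that governs the body of the message.
MEDIA_TYPE_SPECS = {
    "text/html": "HTML specification",
    "application/xml": "XML 1.0 plus Namespaces",
}


def governing_spec(content_type_header):
    """Return the specification that a Content-type header value calls out."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() drops parameters such as charset and lowercases the value.
    media_type = msg.get_content_type()
    return MEDIA_TYPE_SPECS.get(media_type, "unknown: consult the registry")
```

    <p>
      A value missing from the local table is not meaningless; it
      just means the reader has to go to the registry to find the
      specification.
    </p>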
    <h3>
      <a name="publish1" id="publish1">You publish an XML
      document</a>
    </h3>
    <p>
      When you publish a document in XML, then there is another
      layer involved. Many different languages -- or even mixtures
      of languages -- can be sent structured as XML. The MIME type
      of the document can just be "application/xml", which doesn't
      tell the reader how to interpret it. For that, you have to
      look at the outermost element of the XML document. The
      namespace declaration gives a URI indicating the namespace.
    </p>
    <p>
      Note the difference between the use of a URI and a central
      registry. Because the namespace is identified by a URI, the
      web becomes the registry. Anyone can make a new XML
      namespace. Also, one can use a URI, such as an HTTP URI, which
      can be dereferenced. This allows the information which would
      have been in the registry to be put into a web document. (The
      W3C TAG is currently debating the issue of the best format to
      use for this meta information, but HTML, RDDL and RDF have
      been used in various combinations.) Broadly there are two
      types of information. There may be a specification (or a
      reference to one) to tell a human reader what the language is
      and how to interpret it. There may also be data -- a schema
      which describes the grammar of the language, or even the
      start of a logical definition of what the language means.
    </p>
    <p>
      But whatever information may or may not be available
      automatically, in an XML world, a system has to look into the
      document, at the namespace of the outermost element, to know
      how to interpret it. This generally means what application to
      launch - not to mention what icon to use to represent the
      document to a person.
    </p>
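    <p>
      Looking into the document for the namespace of the outermost
      element could be sketched as below. The namespace URI
      <code>http://example.org/2002/orders</code> and the
      <code>HANDLERS</code> table are invented for illustration; the
      sample document is built with the library rather than written
      out by hand.
    </p>

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and handler table, for illustration only.
ORDER_NS = "http://example.org/2002/orders"
HANDLERS = {ORDER_NS: "order-processing application"}


def root_namespace(xml_bytes):
    """Return the namespace URI of the outermost element, or None."""
    root = ET.fromstring(xml_bytes)
    # ElementTree spells a qualified tag as "{namespace-uri}localname".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


def application_for(xml_bytes):
    """Choose what to launch by looking inside the document, not at its MIME type."""
    return HANDLERS.get(root_namespace(xml_bytes), "generic XML viewer")


# A small document whose root element is in ORDER_NS.
document = ET.tostring(ET.Element("{" + ORDER_NS + "}order"))
```

    <p>
      Both documents in such a system may arrive as
      "application/xml"; only the namespace of the root element
      distinguishes them.
    </p>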
    <p>
      An example of a machine-readable document with important
      semantics is an online P3P web site privacy policy. This is
      an XML document which gives, for each category of personal
      information, the sort of thing the web site promises to do or
      not do with it. It can be scanned by a browser more easily
      than a person can read a privacy policy. It is a useful
      feature, as it saves everyone's time and increases public
      confidence in responsible web sites. It clearly depends on
      the meaning of the terms being well defined by the
      specification.
    </p>
    <p>
      <em>(Problem: this doesn't always happen: MathML and XHTML as
      XML in practice.@@ links)</em>
    </p>
    <h3>
      <a name="publish" id="publish">You publish an RDF
      document</a>
    </h3>
    <p>
      Now let's talk semantics. Harder semantics - for logical
      systems. Some XML documents are RDF documents. RDF/XML is an
      XML-based language for data. It is very simple: each document
      is just a set of "triples". A triple gives the value of some
      property of some object - or some relationship between some
      object and some other object. The triples are independent, so
      interpreting the document is just, the RDF spec explains, a
      question of interpreting each triple.
    </p>
    <p>
      How do you figure out what a triple means? Well, the property
      (or relationship) is identified by a URI. And whoever made up
      the URI gets to say what the property means, that is, what
      any triple using that property means.
    </p>
    <p>
      So if I make a property http://www.w3.org/2002/05/example#color
      and define that the color property is a name out of the
      Pantone(tm) list of colors and you send someone an order in
      RDF for a hat which has
      <em>http://www.w3.org/2002/05/example#color</em> of
      <em>blue256</em> then you are specifying blue256 on the
      Pantone scale. No one can argue that you meant some other
      scale of blue. Normally the argument is made much easier by
      my actually writing a document
      http://www.w3.org/2002/05/example in which I explain what
      #color means. No one can argue, in their catalog, that "By
      suit, we mean something which is black, whatever
      <em>http://www.w3.org/2002/05/example#color</em> someone
      might say it is". The meaning of the triple is determined by
      the property, not by the subject or the object of the triple.
    </p>
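    <p>
      A minimal sketch of this rule, with a triple modelled as a
      plain Python tuple: to find out what a triple means, take its
      property URI and strip the fragment to get the document in
      which the owner explains it. The subject URI of the hat is
      made up for the example.
    </p>

```python
from urllib.parse import urldefrag

COLOR = "http://www.w3.org/2002/05/example#color"

# Each triple is (subject, property, value). The subject URI is hypothetical.
order = [
    ("http://example.org/orders/1#hat", COLOR, "blue256"),
]


def defining_document(property_uri):
    """Return the URI of the document where the property's owner explains it."""
    document_uri, _fragment = urldefrag(property_uri)
    return document_uri


def where_to_look(triples):
    """Map each triple to the document that says what it means.

    Only the property is consulted; subject and value play no part.
    """
    return [defining_document(prop) for _subject, prop, _value in triples]
```

    <p>
      Because the triples are independent, the function treats them
      one at a time, just as the RDF spec treats their
      interpretation.
    </p>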
    <table border="2">
      <caption>
        A <a name="section" id="section">section</a> through the
        stack
      </caption>
      <tbody>
        <tr>
          <th>
            Specification
          </th>
          <th>
            Field
          </th>
          <th>
            Where to look up values
          </th>
          <th>
            Example value
          </th>
          <th>
            Example value calls out
          </th>
        </tr>
        <tr>
          <td>
            Ethernet (cf. IEEE 802.3)
            <p>
              and either DIX(RFC894) or 802.2,3 <a href="http://www.ietf.org/rfc/rfc1042.txt">RFC1042</a>
            </p>
          </td>
          <td>
            Ethernet type (or protocol identification field for
            LLC) 16-bit Ethertype
          </td>
          <td>
            IEEE registry
            <p>
              Assignment by RAC process @@link
            </p>
          </td>
          <td>
            0x800
          </td>
          <td>
            <a href="http://www.faqs.org/rfcs/rfc791.html">Internet
            Protocol (RFC791)</a>
          </td>
        </tr>
        <tr>
          <td>
            <a href="http://www.faqs.org/rfcs/rfc791.html">Internet
            Protocol (RFC791)</a>
          </td>
          <td>
            Protocol
          </td>
          <td>
            IANA protocol-numbers
          </td>
          <td>
            <a href="http://www.iana.org/assignments/protocol-numbers">6</a>
          </td>
          <td>
            Transmission Control protocol (RFC793)
          </td>
        </tr>
        <tr>
          <td>
            <a href="http://www.ietf.org/rfc/rfc0793.txt">Transmission
            Control protocol (RFC793)</a>
          </td>
          <td>
            port
          </td>
          <td>
            IANA registry
            <p>
              port-numbers
            </p>
          </td>
          <td>
            <a href="http://www.iana.org/assignments/port-numbers">80</a>
          </td>
          <td>
            HTTP 1.1
          </td>
        </tr>
        <tr>
          <td>
            <a href="/Protocols/rfc2616/rfc2616.html">HTTP 1.1</a>
          </td>
          <td>
            content-type
          </td>
          <td>
            IANA registry
            <p>
              mime types
            </p>
          </td>
          <td>
            application/xml
          </td>
          <td>
            XML1.0+NS
          </td>
        </tr>
        <tr>
          <td>
            <a href="/TR/REC-xml">XML</a> 1.0+<a href="/TR/REC-xml-names">NS</a>
          </td>
          <td>
            xmlns
          </td>
          <td>
            The Web
          </td>
          <td>
            ...@@..rdf
          </td>
          <td>
            RDF M&amp;S 1.0
          </td>
        </tr>
        <tr>
          <td>
            <a href="/TR/REC-rdf-syntax">RDF MS 1.0</a>
          </td>
          <td>
            property
          </td>
          <td>
            The Web
          </td>
          <td>
            rdf:type
          </td>
          <td>
            RDF MS 1.0 section 4.1
          </td>
        </tr>
        <tr>
          <td>
            <a href="/TR/REC-rdf-syntax/#type">RDF MS 1.0
            definition of rdf:type</a>
          </td>
          <td>
            object
          </td>
          <td>
            The Web
          </td>
          <td>
            cyc:Person
          </td>
          <td>
            cyc ontology
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      Looking at the table which summarizes the steps we have been
      through, you will see the specs are connected by some field
      which points to the next spec through some list or registry.
      For the more recent layers, the registry has been replaced by
      the Web.
    </p>
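    <p>
      The table can be read as a linked structure, and a small
      sketch may make that concrete. The <code>STACK</code>
      dictionary below just transcribes the rows (the xmlns example
      value is left as the placeholder it is in the table), and
      <code>walk</code> is a hypothetical helper which follows each
      row to the specification it calls out.
    </p>

```python
# specification -> (field, where values are looked up,
#                   example value, specification that value calls out)
STACK = {
    "Ethernet": ("Ethertype", "IEEE registry", "0x800", "Internet Protocol"),
    "Internet Protocol": ("Protocol", "IANA protocol-numbers", "6", "TCP"),
    "TCP": ("port", "IANA port-numbers", "80", "HTTP 1.1"),
    "HTTP 1.1": ("content-type", "IANA MIME types", "application/xml", "XML 1.0+NS"),
    "XML 1.0+NS": ("xmlns", "The Web", "...@@..rdf", "RDF MS 1.0"),
    "RDF MS 1.0": ("property", "The Web", "rdf:type", "definition of rdf:type"),
    "definition of rdf:type": ("object", "The Web", "cyc:Person", "cyc ontology"),
}


def walk(start):
    """Follow the chain of specifications from `start` until it runs out."""
    chain = [start]
    while chain[-1] in STACK:
        chain.append(STACK[chain[-1]][3])
    return chain


def lookup_places(start):
    """List where each field value along the chain is looked up."""
    return [STACK[spec][1] for spec in walk(start)[:-1]]
```

    <p>
      Calling <code>lookup_places("HTTP 1.1")</code> gives one
      registry followed by "The Web" three times, which is the shift
      the table shows.
    </p>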
    <h2 id="hooks">
      The hooks - identifiers
    </h2>
    <p>
      That's an interesting trend. If you like, we can see the
      technology move through three stages of civilization, in
      terms of the identifiers which are used for concepts.
    </p>
    <ol>
      <li>Using numbers or strings
      </li>
      <li>Using URIs - identify the same thing in all contexts
      </li>
      <li>Using dereferenceable URIs
      </li>
    </ol>
    <p>
      The early protocols used numbers and strings, which require a
      central registry. That worked, because the only common
      concepts were those in the standard protocols, and those had
      to be common across the net for interoperability. In these
      areas still there is a strong argument for central control.
    </p>
    <p>
      As we move on to later protocols, the protocols themselves
      become more diverse. This is partly because they are at a
      higher application level. The centralized model starts to
      break down, as witness some of the social difficulties of
      getting an IANA allocation for a MIME type for an embryonic W3C
      specification. So new protocols allow new applications to be
      defined using URIs, allowing anyone who has access to a bit
      of domain space to allocate them.
    </p>
    <p>
      The third stage of civilization is the one at which the
      identifiers can be looked up on the web. This is quite useful
      for engineers who encounter new languages. It doesn't really
      justify its existence, though, until one has technology --
      Semantic Web technology -- in which an automated agent can
      pick up metadata about the languages on the fly, and use that
      metadata to enhance its processing of data in that language.
    </p>
    <p>
      (What if I don't have a web site? This is becoming less and
      less of a problem. There are all kinds of existing ways of
      allocating an identifier. But the persistence of such
      information is, and always will be, like the cleanliness of
      water and air, an important social issue.)
    </p>
    <h2>
      <a name="When" id="When">When the chain does NOT connect</a>
    </h2>
    <p>
      We have seen how any user of the Internet is bound to a
      series of specifications which define the meanings of terms,
      and hence allow his or her equipment and agents to
      interoperate with others. This stack prevents one from
      sending a nasty email to someone and then protesting that the
      message didn't mean anything. So if the stack is so strict,
      how <em>does</em> one send a nasty email message when one
      <em>doesn't</em> mean it? There are plenty of times you want
      to include an attachment to which you want to refer, but for
      which you don't claim authorship or responsibility.
      Understanding the exceptions is as important as understanding
      the general rule. Many protocols have ways of breaking the
      chain, of including information which is not part of the
      meaning of the message.
    </p>
    <p>
      In email it is an <strong>attachment</strong>. There is
      always in email a cover note, the basic message, which
      conveys the actual message. You normally only use any
      attachment according to the main message. It might be "Hey,
      Joe, what do you think of this paper?", or "Look at this
      stupid program - but whatever you do don't run it!"
    </p>
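    <p>
      In Python's standard email library the distinction is visible
      in the data structure itself. The sketch below builds a
      message with a cover note and one attachment; the subject
      line, the file name and the attachment bytes are placeholders.
    </p>

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "A program to look at"
# The cover note is what the sender actually says.
msg.set_content("Look at this stupid program - but whatever you do don't run it!")
# The attachment is carried along but not asserted; its bytes are a placeholder.
msg.add_attachment(
    b"placeholder bytes",
    maintype="application",
    subtype="octet-stream",
    filename="program.bin",
)

# The body and the attachments are retrieved by different calls.
cover_note = msg.get_body(preferencelist=("plain",)).get_content()
attachments = [part.get_filename() for part in msg.iter_attachments()]
```

    <p>
      A reader of the message gets the sender's words from the body,
      and treats whatever comes out of
      <code>iter_attachments()</code> according to what the body
      says about it.
    </p>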
    <p>
      Currently (2002) XML doesn't have a common standard for what
      has been called in that context "<strong>packages</strong>".
      This is a pity. It is on the agenda for the XML Protocol working
      group, as it is seen as essential for SOAP operations. One must be
      able to include documents stapled to a SOAP request or
      response, which are not to be just acted on.
    </p>
    <p>
      At the Semantic Web level, those who have played with the
      <a href="Notation3.html">Notation3</a> language will
      recognize the curly brackets as the packaging, or
      <strong>quoting</strong>. Whereas a document
    </p>
    <pre>my:car  srgb:color "000044".
</pre>
    <p>
      asserts that the car in question is blue, the document
    </p>
    <pre>my:form67  :says {my:car  srgb:color "000044"}.
</pre>
    <p>
      does not. It merely says something about the statement that
      the car is blue.
    </p>
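    <p>
      One way to picture the difference, as a sketch only and not
      how any N3 processor is actually built: model a document as a
      list of triples, and model a quoted formula as a frozen set of
      triples sitting in the value position. The helper names
      <code>asserted</code> and <code>mentioned</code> are invented
      for this illustration.
    </p>

```python
# A triple is (subject, property, value). A quoted formula is a frozenset of
# triples used as a value: it is referred to, not asserted.
CAR_IS_BLUE = ("my:car", "srgb:color", "000044")

asserting_doc = [CAR_IS_BLUE]
quoting_doc = [("my:form67", ":says", frozenset([CAR_IS_BLUE]))]


def asserted(document):
    """Return the triples a document itself claims to be true.

    Only top-level triples count; the contents of a quoted formula are
    deliberately not unpacked.
    """
    return set(document)


def mentioned(document):
    """Return the triples a document quotes without asserting them."""
    found = set()
    for _subject, _prop, value in document:
        if isinstance(value, frozenset):
            found |= value
    return found
```

    <p>
      The statement that the car is blue is in
      <code>asserted(asserting_doc)</code> but only in
      <code>mentioned(quoting_doc)</code>.
    </p>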
    <p>
      So being able to refer to something without asserting it,
      whether you call it attachment, packaging, or quoting, is an
      important feature of a language. The fact that you can do
      this removes the last excuse for anyone claiming not to have
      meant whatever they did say in the main message!
    </p>
    <h2 id="Conclusion">
      Conclusion
    </h2>
    <p>
      Internet messages and Web documents are represented in
      computer languages with well-defined specifications. Use of
      the Internet and the Web implies an acceptance of the
      specifications as authoritative.
    </p>
    <p>
      The specifications are linked together by identifiers which
      in earlier specs were numbers, but in later specs are URIs,
      ideally URIs which can be looked up on the Web. The ability
      to make these linked specifications requires the
      specifications to be designed very independently. This is
      simply the software engineering practice of information
      hiding between layers.
    </p>
    <p>
      The trend for the higher layers is toward more and more
      machine-processable metadata about such languages, which can
      be retrieved automatically and will aid in processing. Some
      of these will relate the semantics of terms in one vocabulary
      to terms in another, in a web-like way.
    </p>
    <p>
      The fact that as we move into the applications we see more
      and more diverse uses of the Web and the Net does not
      diminish our reliance on sound standards in the supporting
      infrastructure.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 01 Jan 2017 00:00:00 GMT</pubDate>
  <title>A &quot;Stretch Friend&quot;</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/StretchFriend.html</link>
    <guid>https://www.w3.org/DesignIssues/StretchFriend.html</guid>
      <description><![CDATA[
   
    <h1>Stretch friends</h1>
<p>
We celebrate the web as a space which allows you to discover all kinds of new things, people and places.  We admire it as something which breaks down geographical barriers, allowing me to interact with someone on the other side of the world as easily as with the person next door. But are we really using that aspect of it?
</p><p>

When we log on to a social networking site, the site keeps a record of the people we know and suggests new people whom we should know or maybe do know.  The simple way of doing this is just to pick people who are friends of our existing friends.  We are almost sure to like them, especially if we have a lot of friends in common. So what happens if we accept these new proposed friends? Our social network becomes a solid mass of mutually connected people.  We make cliques.  What's wrong with this picture? It doesn't in fact add new social experience.  In fact, as Eli Pariser describes in "The Filter Bubble" and 
<a href="http://www.ted.com/talks/eli_pariser_beware_online_filter_bubbles.html">
his TED talk</a>, we can end up constructing a world in which we only meet people who act and think like us.  So it is, for example, that when kids have the ability on the Net to play a multiplayer online game with anyone logged on in the whole world, they end up playing with the kid next door, maybe even the kid at the other end of the couch.  This is good for the cohesion of the neighborhood, but is it good for the planet?
</p><p>

Suppose once in a while when the system suggests a friend, it offers a Stretch Friend.  A stretch friend may be almost identical to you, but differs along one axis. You are a white male liberal geek in New York and you are offered a black male liberal geek in New York or maybe a white male liberal geek in Saudi Arabia. You get the idea... You are asked to cross one boundary, of race, color, creed, intellectual interests, political leaning, culture, location. Just one at a time. Maybe more later, but for now, let's see how you get on.  Can you put the effort in to talk to someone whose life or beliefs or interests differ along one axis?
</p><p>
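
The selection rule can be sketched in a few lines of Python. This is only one possible reading of the idea: the list of axes, the profile fields and the sample people are all invented for illustration, and a real system would have to decide for itself what counts as an axis.

```python
# Hypothetical axes along which two profiles are compared.
AXES = ("race", "gender", "politics", "interests", "location")


def differing_axes(me, other):
    """List the axes on which two profiles disagree."""
    return [axis for axis in AXES if me[axis] != other[axis]]


def stretch_friends(me, candidates):
    """Keep the candidates who differ from `me` along exactly one axis."""
    return [c for c in candidates if len(differing_axes(me, c)) == 1]


me = {"name": "you", "race": "white", "gender": "male",
      "politics": "liberal", "interests": "geek", "location": "New York"}

candidates = [
    dict(me, name="a", race="black"),
    dict(me, name="b", location="Saudi Arabia"),
    dict(me, name="c", race="black", politics="conservative"),
    dict(me, name="d"),
]

suggested = [c["name"] for c in stretch_friends(me, candidates)]
```

Candidates "a" and "b" cross one boundary each and are suggested; "c" crosses two and "d" crosses none, so neither is offered for now.
</p><p>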

If you could, and lots of people had stretch friends, would the world be an incrementally better place, or maybe after a while a dramatically different place?
</p><p>

Nate Silver, who is famous for his insight into how people vote, in a TED talk @@@ explains his puzzlement about why, in the election which first elected Barack Obama to the US presidency, there was a general shift toward the liberal side across the whole country ... except for a bit of it. A certain demographic didn't shift, and he wondered why. Who were they? He found they were the people who, when asked, said they just could not imagine a black person being president of the USA. And who were those people, then? Well, they were by strong correlation the people who answered "no" to the question, "Have you ever lived or worked with someone of a different race". @@check details
</p><p>

So human beings have an ability to mark people outside their normal zone as being strange, or foreign, or enemy.  We have evolved the neurology to become tribal at the drop of a hat, as those rather horrid experiments showed when school children wore badges saying whether they had blue or brown eyes. @@ ref.   Breaking down these boundaries, these differences of a single quality or situation which can otherwise allow us to switch into the tribal mode, therefore becomes a matter of urgency.
</p><p>

The stretch friend idea can obviously be done in a number of ways and expanded into more complicated schemes. It could be connected with twinning projects in which whole towns or schools pair up with partner towns or schools in very different parts of the world.  (Of course when these are arranged by churches, they don't always cross religious boundaries!)
</p><p>

Web scientists, people who analyze these things, could do lots of math to try to work out what the effect would be.  Or we could just try it.  
</p>    
    
    ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 01 Apr 2004 00:00:00 GMT</pubDate>
  <title>New Top Level Domains .mobi and .xxx Considered Harmful</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/TLD.html</link>
    <guid>https://www.w3.org/DesignIssues/TLD.html</guid>
      <description><![CDATA[
    <h1>
      New Top Level Domains .mobi and .xxx Considered Harmful
    </h1>
    <blockquote>
      <p>
        In 2004 there were proposals to create new top-level
        domains which included <code>.mobi</code> and
        <code>.xxx</code>. There are major problems with these
        proposals. There are costs in general to creating any new
        top level domain. There are specific ways in which the
        ".mobi" breaks the Web architecture of links, and attacks
        the universality of the Web.
      </p>
      <p>
        At their 14 May 2004 face-to-face meeting, the W3C
        Technical Architecture Group resolved to support this
        document, with Norman Walsh abstaining, and Paul Cotton
        recusing himself.
      </p>
    </blockquote>
    <h2>
      Introduction
    </h2>
    <p>
      When the Internet was being collaboratively developed by a
      substantially technical community around a growing but still
      manageable Internet Engineering Task Force, the Domain Name
      System (DNS) evolved as a hierarchical solution to the
      problem of keeping track of which computers had which
      Internet Protocol (IP) addresses. The tree structure was an
      improvement over the previous flat space of host names. It
      reduced the chaos, by allowing new names to be allocated in
      sub-domains without recourse to a central registration
      system. Because the frequency of allocation of new names
      decreased as one ascended the tree toward the root, the
      actual cost was kept manageable.
    </p>
    <p>
      As email and World Wide Web (WWW) use blossomed and became
      increasingly important, domain names crept out of the
      message syntax for Internet protocols and crept into daily
      parlance. It then became valuable to own a short domain name.
      This turned domain name space into a limited commodity. After
      some tussles for control (ongoing at the time of writing) and
      some large amounts of money changing hands in some cases, the
      system has now settled down to a market-based one in which
      names can be rented, transfer value can be asked by the old
      owner of the new owner, and one-time and annual fees are
      typically payable by any domain to any company managing the
      higher domain. An anomaly was that unclaimed names were
      deemed to have no owner and no value, and were allocated in a
      "first come first served" frenzy in which speculators made
      great profits and held to ransom those who may have been
      considered the more logical owner of a name. This anomaly
      created great instability. It had costs, in that any
      trademark owner had to beware of parties who would register
      domains which included their trademark. The Example
      Manufacturing Company had to ensure that it owned not only
      <em>example.com</em> which it had used for email and Web site
      for many years, but also <em>example.net</em> and
      <em>example.org</em> to avoid unscrupulous competition
      setting up sites to benefit from Example's excellent
      reputation. As the business grew, Example had to also acquire
      <em>example.fr</em> and <em>example.co.uk</em> to ensure that
      confusion was minimized.
    </p>
    <p>
      The fact was that the public memory was not for the domain
      name, but for the brand name which was sandwiched between
      <em>www</em> and .<em>com</em>. To this extent, in the world
      of memorable domain names, the hierarchicalization of the
      domain system had failed to happen. In the public's memory,
      <em>example</em> was the mark, and the difference between
      example.com and example.net merely a source of confusion.
    </p>
    <p>
      As each node in the tree represents a potentially valuable
      asset, control of any subset of the tree is valuable. Control
      of the entire tree is managed by ICANN, which is set up to be
      a non-profit international institution, with the intent that
      it should as such carry the trust of the entire community in
      its efforts to manage the system for the common good. Control
      of subtrees such as .net, .com and .org is delegated to a set
      of parallel registries whose business model is nominally the
      charging of registration and annual fees. There have been
      temptations for the registry companies to consider themselves
      owners of unclaimed names. Rumors have abounded about systems
      which would automatically rent a domain name about which a
      potential renter was inquiring, or would redirect traffic
      from an unclaimed Web site to their own Web site, and so on.
    </p>
    <h2>
      The Cost of Change
    </h2>
    <p>
      The top level of the domain name system, and to a lesser
      extent the IP address space, are the single weak,
      centralised, points of an otherwise strong, decentralised
      system. The Internet is a net, and the WWW is a Web, but WWW
      and email use DNS which is a tree, which has a single root.
      Although there are many benefits to a system with global
      identifiers, there are also costs, such as a single common
      DNS tree. As a community we have all decided that the
      benefits of the system (such as being able to quote
      example.com anywhere in the world and have it mean the same
      thing) outweigh the costs of the social systems required to
      ensure fairness in its operation. There is, however, great
      stress. ICANN is under constant pressure to alter its balance
      of power or modus operandi. It balances technical, academic,
      commercial, and governmental inputs. The whole issue of
      domain names has created a vast amount of concern. And
      because the DNS tree is so fundamental to the Internet
      applications which build on top of it, any uncertainty about
      the future immediately creates instability and harm.
    </p>
    <p>
      Our first instinct, then, should be not to change the system
      with anything but incremental and carefully thought-out
      changes. The addition of new top-level domains is a very
      disturbing influence. It carries great cost. It should only
      be undertaken when there is a very clear benefit to the new
      domain. In the case of the proposed .mobi domain, the change
      is actually detrimental.
    </p>
    <h2>
      The Economics of Domain Names
    </h2>
    <p>
      In practice, for most domain name owners, the part between
      the "www" and the top level domain is their brand, or their
      name. It is something they need to protect. This means that
      in practice, a serious organization to avoid confusion has to
      own its domain in every non-geographical top level domain.
      For a large company, the cost of this may be insignificant.
      For a small enterprise, a non-profit organization or a
      family, the cost becomes very significant.
    </p>
    <p>
      The chief effect of the introduction of the .biz and .info
      domains appears to have been a cash influx for the domain
      name registries. Example Inc., as mentioned above, owns
      example.com, .org and .net. Does it also have to buy .biz,
      .info, and .name to avoid confusion and the misappropriation
      of its name by others? Will it also have to rent
      "<em>example.mobi</em>" in case it wants to make information
      available for people who use wireless equipment?
    </p>
    <p>
      The market for second-level domains is a market for a limited
      resource. After an unstable period when the first come first
      served system was in play and greedy squatters grabbed
      domains simply for speculation, it has now settled down.
      Introducing new TLDs has two effects.
    </p>
    <p>
      The first effect is a little like printing more money. The
      value of one's original registration drops. At the same time,
      the cost of protecting one's brand goes up (from the cost of
      three domains to four, five, ...).
    </p>
    <p>
      The value of each domain name such as example.com also drops
      because of brand dilution and public confusion. Even though
      most people largely ignore the last segment of the name, when
      it is actually used to distinguish between different owners,
      this increases the mental effort required to remember which
      company has which top level domain. This makes the whole name
      space less usable.
    </p>
    <p>
      Is it fair to reduce the value of these domains which have
      been acquired at great cost by their owners?
    </p>
    <p>
      The second effect is that instability is brought on. There is
      a flurry of activity to reserve domain names, a rush one
      cannot afford to miss in order to protect one's brand. There
      is a rash of attempts to steal well-known or valuable
      domains. The whole process involves a lot of administration,
      a lot of cost per month, a lot of business for those involved
      in the domain name business itself, and a negative value to
      the community.
    </p>
    <h2>
      Fairness
    </h2>
    <p>
      As we have seen, the choice of a tree structure for domain
      names is one which has costs and benefits, and the community
      currently accepts both. The cost of confusion, and of extra
      name registrations, is high. When the benefits of the new
      domain itself are small or negative (as we discuss below),
      then one looks for incentive. The large amount of money that
      has changed hands for domain names might lead a person to
      suspect that this was the motivation. Under these
      circumstances, to increase public trust, proposals from
      non-profit organizations would raise less suspicion.
    </p>
    <p>
      The root of the domain name system is a single public
      resource, by design. Its control must be for and, indirectly,
      by the people as a whole. To give away a large chunk of this
      to a private group would be simply a betrayal of the public
      trust put in ICANN.
    </p>
    <h2>
      Specific Problems with .mobi
    </h2>
    <p>
      The different domains are introduced for different reasons,
      so we must answer this for each one. The <a href="http://www.icann.org/tlds/stld-apps-19mar04/stld-public-comments.htm">
      ICANN list of proposals</a> gives pointers to the proposals.
    </p>
    <p>
      The .mobi domain is described as being for the use of a
      community. To quote the proposal, the target community for
      the .mobi TLD is:
    </p>
    <blockquote>
      <ul>
        <li>Individual and business consumers of mobile
        devices,services and applications
        </li>
        <li>Mobile content and service providers
        </li>
        <li>Mobile operators
        </li>
        <li>Mobile device manufacturers and vendors
        </li>
        <li>IT technology and software vendors who serve the mobile
        community
        </li>
      </ul>
    </blockquote>
    <p>
      This is in fact a mixture of reasons. It sounds as though
      there is a use for ".mobi" when the provider of a service
      intends it to be for the benefit of mobile users. There
      appears to be a desire to limit the use of ".mobi" to
      companies -- perhaps those in the group.
    </p>
    <p>
      This domain will have a drastically detrimental effect on the
      Web. By partitioning the HTTP information space into parts
      designed for access from mobile devices and parts designed
      (presumably) not for such access, an essential property of
      the Web is destroyed.
    </p>
    <h3>
      Device Independence.
    </h3>
    <p>
      The Web is designed as a universal space. Its universality is
      its most important facet. I spend many hours giving talks
      just to emphasize this point. The success of the Web stems
      from its universality as do most of the architectural
      constraints.
    </p>
    <p>
      The Web must operate independently of the hardware, software
      or network used to access it, of the perceived quality or
      appropriateness of the information on it, and of the culture,
      and language, and physical capabilities of those who access
      it [<a href="#WTW">WTW</a>]. Hardware and network
      independence in particular have been crucial to the growth of
      the Web. In the past, network independence has been assured
      largely by the Internet architecture. The Internet connects
      all devices without regard to the type or size or brand of
      device, nor with regard to the wireless or wired or optical
      infrastructure used. This is its great strength. From its
      inception, the Web built upon this architecture and
      introduced device independence at the user interface level.
      By separating the information content from its presentation
      (as is possible by mixing HTML with CSS, XML with XSL and
      CSS, etc.) the Web allows the same information to be viewed
      from computers with all sorts of screen sizes, color depths,
      and so on. Many of the original Web terminals were
      character-oriented, and now visually impaired users use
      text-oriented interfaces to the same information.
    </p>
    <p>
      For a time, many Web site designers did not see the necessity
      for such device independence, and indicated that their site
      was "best viewed using screen set to 800x600". Those Web
      sites now look terrible on a phone or, for that matter, on a
      much larger screen. By contrast, many Web sites which use
      style sheets appropriately can look very good on a very wide
      range of screen sizes.
    </p>
    <p>
      It is true that to optimize the use of any device, an
      awareness on the part of the server allows it to customize
      the content and the whole layout of a site. However, the
      domain name is perhaps the worst possible way of
      communicating information about the device. Devices vary in
      many ways, including:
    </p>
    <ul>
      <li>Network bandwidth at the time,
      </li>
      <li>Screen size and resolution,
      </li>
      <li>Intermittent or continuous connectivity,
      </li>
    </ul>
    <p>
      and so on. While with the current technology, the phrase
      "Mobile" may equate roughly in many minds to "something like
      a cell phone", it is naive -- and pessimistic -- to imagine
      that this one style of device will be the combination that
      will endure for any length of time. Just as concepts such as
      the "Network PC" and the "Multimedia PC" which defined
      profiles of device capability were swept away in the onrush
      of technology, so will an attempt to divide devices, users
      and content into two groups. Small devices will have high
      bandwidth. Devices with large screens will sometimes have
      small bandwidth. Some "mobile" phones will be permanently
      mounted on kitchen walls. The range of digital assistants
      will continue to evolve.
    </p>
    <p>
      There are good ways to deal with and derive the greatest
      benefit from the growing diversity of client devices. The
      adaptation may occur on the client side, the server side, or
      both. For example, the CC/PP specifications provide a
      framework for a client device to describe its capabilities in
      great detail to a server. This is based partly on the UAPROF
      (User Agent Profile) specifications developed by the mobile
      phone industry. Also, the HTTP specification has a content
      negotiation mechanism which allows a device to give a simple
      profile of its capabilities whenever it asks for some
      information. Even when a server serves the same static
      content to mobile and fixed systems, Cascading Style Sheets
      (CSS) allows specific style information to be applied by
      hand-held clients only, allowing quite different
      presentations to be displayed in the two cases. These
      systems, just a few of the technologies which already exist,
      leaving aside those which could be designed, are much more
      powerful than a top level domain name.
    </p>
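    <p>
      As a rough illustration of the content negotiation idea
      mentioned above, the following Python sketch picks one of several
      representations of a single URI from the client's Accept header.
      The function names and the list of variants are assumptions made
      for this example only, and the parsing is deliberately simplified
      (it ignores partial wildcards such as "text/*").
    </p>

```python
# Hypothetical sketch: one URI, several representations, chosen per request.
# The variant list and the function names are illustrative assumptions.

def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # the default quality value when none is given
        for param in fields[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((media_type, q))
    # sorted() is stable, so equal-q entries keep the order the client gave
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_variant(accept_header, available):
    """Return the first available media type the client prefers, or None."""
    for media_type, q in parse_accept(accept_header):
        if q == 0:
            continue  # q=0 means "not acceptable"
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return available[0]
    return None


# The same URI serves every client; only the representation differs.
VARIANTS = ["text/html", "application/xhtml+xml", "text/plain"]

desktop = choose_variant("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8", VARIANTS)
handheld = choose_variant("text/plain;q=1.0,text/html;q=0.5", VARIANTS)
print(desktop, handheld)
```

    <p>
      The point of the sketch is that the device's needs travel with
      the request, so the URI itself never has to change.
    </p>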
    <p>
      The various documents about the ".mobi" Top Level Domain talk
      about not only mobile devices but "mobile users" and "mobile
      businesses". There is an indication that the mobile
      technology providers feel that while one is mobile, or when
      one is catering to a mobile customer, one is special or
      different. This may in fact be motivated simply by attempts
      to increase the visibility of the mobile communications
      supplier's name. It may be connected with a hope by the
      communication providers to gain some control over
      information flow to and from mobile users. This would be
      detrimental to the open markets enabled by the Internet.
    </p>
    <p>
      If neither of these motivations is the cause, then perhaps
      there is an honest belief that being mobile will indeed be
      best when it is visible to end users. In other words, the
      mobile communications providers are expecting to declare
      failure. It is failure when a communications system, in
      providing connectivity, becomes foremost in the user's
      perceptions. A travel agent should be a travel business, not
      a "mobile business". In a reasonable world, the travel agent
      gets on with selling flights and not worrying about whether a
      customer is attached by a wire. In a reasonable world, a
      phone is a phone and the particular electromagnetics used to
      connect it to another phone are totally uninteresting
      compared to the fact that a person is connected to another
      person.
    </p>
    <h3>
      Damage: Loss of Web Functionality
    </h3>
    <p>
      But the point is not that a division into ".mobi" and the
      ("immobile?") rest of the world is futile, it is that it is
      harmful.
    </p>
    <p>
      The Web works by reference. As an information space, it is
      defined by the relationship between a URI and what one gets
      on using that URI. The URI is passed around, written, spoken,
      buried in links, bookmarked, traded while Instant Messaging
      and through email. People look up URIs in all sorts of
      conditions.
    </p>
    <p>
      It is fundamentally useful to be able to quote the URI for
      some information and then look up that URI in an entirely
      different context. For example, I may want to look up a
      restaurant on my laptop, bookmark it, and then, when I only
      have my phone, check the bookmark to have a look at the
      evening menu. Or, my travel agent may send me a pointer to my
      itinerary for a business trip. I may view the itinerary from
      my office on a large screen and want to see the map, or I may
      view it at the airport from my phone when all I want is the
      gate number.
    </p>
    <p>
      Dividing the Web into information destined for different
      devices, or different classes of user, or different classes
      of information, breaks the Web in a fundamental way.
    </p>
    <p>
      I urge ICANN not to create the ".mobi" top level domain.
    </p>
    <p>
      Tim Berners-Lee
    </p>
    <address>
      Cambridge, Massachusetts, 14 May 2004
    </address>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 07 Jul 2019 00:00:00 GMT</pubDate>
  <title>Goals for A Human-Data Interface</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/TabulatorGoals.html</link>
    <guid>https://www.w3.org/DesignIssues/TabulatorGoals.html</guid>
      <description><![CDATA[There was, in 2010, an unfulfilled need for a powerful,
    friendly program to interact with data on the web. The document
    web spread by the simple method of people copying the source of
    each other's web pages, and getting the instant gratification
    of being able to see one's work in a web browser, and indeed to
    pass on the URL to friends and family. However, most of the
    initial Semantic Web projects (around 2000) used back-end
    machines with no generic user interface. Researchers, to
    explain data to each other, tended to use circles-and-arrows
    diagrams of the graphs. These, though, are hopeless -- very
    inefficient uses of space -- when it comes to displaying data
    to a normal user. Normal users expect things like the Mac OS X
    AddressBook application, a spreadsheet, or the iTunes tune
    finding interface. These use fairly straightforward lists of
    properties, tables, forms, faceted query views, and so on,
    for the data they manage. The different apps are islands, with
    no links between them, but they do give a user experience which
    users seem to use.
<p><a href="https://www.w3.org/DesignIssues/TabulatorGoals.html">Read rest of article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 01 Nov 2006 00:00:00 GMT</pubDate>
  <title>Using labels to give semantics to tags</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/TagLabel.html</link>
    <guid>https://www.w3.org/DesignIssues/TagLabel.html</guid>
      <description><![CDATA[
    <h1>
      Using labels to give semantics to tags
    </h1>
    <h3>
      Abstract
    </h3>
    <p>
      Existing user interfaces for managing, for example, mail,
      photos, contacts and songs allow searching using both
      user-generated 'tags' and also well-defined properties such
      as the date-time of a photograph, or the values of headers in
      an email message. Users have on many web sites provided many
      tags, but their re-use by others has been limited due to the
      fact that the same tag word has a quite different meaning when
      used by another person or used on another site.
    </p>
    <p>
      Other data, such as 'geotagging' of places or declaration of
      friends and colleagues, by contrast, have a well-defined
      meaning and allow query across data from many sites. This
      article discusses how the user interface metaphor of a
      luggage label can be used to associate metadata from
      well-defined ontologies with tags from a particular context.
    </p>
    <p>
      The article discusses ways of encoding the labels in RDF.
    </p>
    <h3>
      Introduction
    </h3>
    <p>
      There are mixed feelings about the passion for tagging
      which typifies the Web 2.0 wave. On the one hand, there is
      excitement about the fact that users are, in large numbers,
      adding re-usable information to the information space,
      allowing sites such as del.icio.us and flickr to sort,
      cluster and query masses of otherwise amorphous photos and
      web content. On the other hand, there is the sinking feeling
      that tags are headed the same way as keywords of Information
      Retrieval in the 1980s: initial hope, and then being stranded
      between the unbearable constraints of a controlled vocabulary
      and the hopeless ambiguity of uncontrolled user-generated
      keywords. Tom Gruber, writer of books on ontology who runs a
      Web 2.0 site himself, gave a talk at ISWC 2006 which touched
      on bridging the gap, and taking the passion to organize and
      express, and using it to make re-usable data.
    </p>
    <p>
      There is currently a tension in the tagging world as to
      whether tags are regarded as global in meaning, or whether
      their meaning really depends on the tagger. In del.icio.us,
      one can query for things tagged with a certain word by a
      certain person. (I heard of one online community which was
      considering making a system to allow one formally to state
      when one has committed to use a given tag in the same way as
      another person, or a growing mesh of people. That would be a
      very interesting feature, as it would allow a useful
      definition to gain growing acceptance, to progressively move
      from being a private idea to being a group or global standard.)
    </p>
    <p>
      Meanwhile, other sites get users to provide semantic web data
      with well-defined global ontologies. Describing the locations
      of people, events and photos, relationships between people,
      authorship of publications, things and people an image depicts,
      and so on, is done using well-defined identifiers (under the covers)
      for everything involved, including the relationships and
      properties. The resulting data is extremely re-usable. The
      problem is that it isn't as quick as tagging with a single
      word off the top of one's head.
    </p>
    <h3>
      Example: Soccer folders
    </h3>
    <p>
      I face these problems day to day, and like many geeks, am
      driven by the urge to make the boring things in life happen
      automatically, with the computer helping more effectively.
      There are lots of things I can do with N3 rules -- but I'd
      like to have a nice user interface to it which hides as much
      technology beneath the surface as possible. I'd like as many
      non-geeks as possible to be able to use the same tools.
    </p>
    <p>
      Let's take one example. I took a bunch of photos of a local
      soccer team, once when they played Wayland, and once when
      they played Arlington. I loaded them all into iPhoto. I
      wanted to burn a CD for the team of the best of the bunch. I
      also want to be able to find them later.
    </p>
    <p>
      On the first day, I didn't take any other photos, so the
      simplest thing was to make a 'smart folder' (actually 'smart
      Album' in iPhoto), which had in it by definition the photos
      taken on that day. The smart folder allows you to specify a
      combination (<em>and</em> or <em>or</em>) of a number of
      constraints such as time, keyword, text and rating. I called
      this one <em>Soccer vs Wayland</em>.
    </p>
    <p>
      On the second day, I took other photos as well, so the smart
      folder was going to be more complicated. So instead, I just
      found all the photos, selected them, and dumped them in a new
      plain folder <em>Soccer vs Arlington</em>.
    </p>
    <p>
      These of course one would represent in RDF as classes -- but
      we'll get into that later.
    </p>
    <p>
      Ok, so here's where we get into wish-list territory.
    </p>
    <p>
      1) At that point, I wanted to be able to make a virtual
      folder <em>Soccer</em>, and make the two folders subfolders.
      (There used to be a photo processing tool called
      <em>Retriever</em> which would handle hierarchical
      classifications well, but that I lost track of.) This would
      indicate that anything in either of the two Soccer subfolders
      was a member of the Soccer folder -- or was tagged 'soccer'
      if you like.
    </p>
    <p>
      In fact, you can make a smart folder <em>Soccer</em>
      consisting of all the things which are either in <em>Soccer
      vs Wayland</em> or <em>Soccer vs Arlington</em>. You have to
      make it as a smart folder, which is not as intuitive, but
      works fine. It doesn't give me the nice hierarchical user
      interface.
    </p>
    <h3>
      Labels
    </h3>
    <p>
      Actually I now want to associate some exportable re-usable
      data. The folder names are essentially my local tags.
      Exporting them doesn't help much.
    </p>
    <p>
      Suppose, for example, I want to geotag the photos, so that I
      can find them on a map, or people interested in sports at the
      given field could find them. The current user interface
      allows me to select all the photos in one folder and apply
      keywords and apply metadata to them, as a batch operation. It
      is actually useful that the data is carefully stored in each
      photo, but it is sad that the fact that the metadata (such as
      a comment about the game) was applied to everything in the
      folder is lost.
    </p>
    <p>
      I'd like to be able to associate the random tag name I just
      made up with properties to be applied to each of the things
      tagged. Suppose at the user interface we introduce a
      <dfn>label</dfn>. A label is a set of common metadata that I
      want to apply to things at once.
    </p>
    <p>
      The user interface could really milk the <em>label</em>
      metaphor, by representing a label as a box with a hole in the
      end with a bit of string. It clashes perhaps with the folder
      metaphor. If we use both, then I'd like to be able to drop a
      label on a folder, and let all the things in the folder
      inherit the labeled properties.
    </p>
    <p>
      I'd like to see for each photo firstly what properties it
      has, but secondarily which labels and hence folder the
      properties came from.
    </p>
    <p>
      The essential thing about a label is that as I build it, I am
      prompted to use shared ontologies. They could be group
      ontologies which others have exported, they could be globally
      understood ontologies like time and place, and email address
      of a person depicted. As I create the label from an
      (extendable) set of options in menus, and using drag and drop
      and other user interface tricks for noting relationships, I
      am creating data which will be much more useful than the tag.
      The tag then I can slap on very easily.
    </p>
    <p>
      The hope is then that by making label creation something
      which is low cost, because I have to do it only once and can
      apply it many times, the incentive for me @@
    </p>
    <h3>
      Expressing labels
    </h3>
    <p>
      In this section we leave the user interaction and discuss the
      way in which labels can be exchanged in RDF under the covers.
      This of course is important for interoperability. A label can
      be expressed in many ways in bits on the wire. The label
      describes a set of things, which in RDF is a class<a href="#L753">*</a>. Information about the class and the things in
      it -- the things labeled -- can be given in various ways.
    </p>
    <h4>
      As a rule
    </h4>
    <p>
      As a rule, it could look like
    </p>
    <pre>   { ?x a soc:SoccerWaylandPhoto }
=&gt; { ?x geo:approxLocation [ geo:lat 47; geo:long 78 ];
        foaf:depicts soc:ourTeam.
   }.
</pre>
    <h4>
      In OWL
    </h4>
    <p>
      A label is a fairly direct use of OWL restrictions:
    </p>
    <pre>soc:SoccerWaylandPhoto rdfs:subClassOf
    [ a owl:Restriction; owl:onProperty geo:approxLocation;
      owl:hasValue  [ geo:lat 47; geo:long 78 ] ],
    [ a owl:Restriction; owl:onProperty foaf:depicts;
      owl:allValuesFrom soc:ourTeam ].
</pre>
    <p>
      (Let's not discuss the modeling of depiction here, rather
      elsewhere.) This is very much the sort of thing OWL is
      designed for.
    </p>
    <h4>
      How not to
    </h4>
    <p>
      There is one trap which one must beware of. Remember that the
      label is a concept. It is a class. It isn't a photo. The
      label may have been created by someone, at a particular time,
      but that person and that time have nothing to do with the
      creator and time of a photo which is so labeled. You can
      <strong>not</strong> write
    </p>
    <pre>soc:SoccerWaylandPhoto
        geo:approxLocation [ geo:lat 47; geo:long 78 ];
        foaf:depicts soc:ourTeam.
</pre>
    <h4>
      Special vocabulary
    </h4>
    <p>
      It is possible to make special label terms which are used
      only for labels:
    </p>
    <pre>soc:SoccerWaylandPhoto
        LAB:approxLocation [ geo:lat 47; geo:long 78 ];
        LAB:depicts soc:ourTeam.
</pre>
    <p>
      and have some metadata like
    </p>
    <pre>foaf:depicts  ex:labelPredicate LAB:depicts.
geo:approxLocation ex:labelPredicate LAB:approxLocation.
</pre>
    <p>
      and a general rule like
    </p>
    <pre>    { ?x a ?lab. ?lab ?q ?z. ?p ex:labelPredicate ?q }
 =&gt; { ?x ?p ?z }.
</pre>
    <p>
      or
    </p>
    <pre>    { ?lab ?q ?z. ?p ex:labelPredicate ?q }
 =&gt; { ?lab rdfs:subClassOf [ a owl:Restriction;
       owl:onProperty ?p; owl:hasValue ?z] }.
</pre>
    <p>
      These methods are more or less inter-convertible. There are
      various communities which understand OWL and N3 rules, which
      may find those forms most convenient.
    </p>
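    <p>
      For readers who do not use N3, the effect of the general rule
      can be sketched in a few lines of Python. This is only an
      illustration under assumptions: triples are plain tuples of
      strings, the photo identifier and the "_:field" location node are
      made up, and <code>apply_labels</code> is a hypothetical helper,
      not part of any RDF library.
    </p>

```python
# Triples are plain (subject, predicate, object) tuples of strings.
# The vocabulary mirrors the examples in the text; the helper is hypothetical.

RDF_TYPE = "rdf:type"
LABEL_PREDICATE = "ex:labelPredicate"


def apply_labels(triples):
    """Copy each label's properties onto every member of the labelled class."""
    # Map each label-only predicate back to its ordinary property,
    # e.g. LAB:depicts to foaf:depicts.
    ordinary = {o: s for (s, p, o) in triples if p == LABEL_PREDICATE}
    inferred = set()
    for (x, p1, lab) in triples:
        if p1 != RDF_TYPE:
            continue
        for (s, q, z) in triples:
            if s == lab and q in ordinary:
                # The member x gets the ordinary property; the label does not.
                inferred.add((x, ordinary[q], z))
    return inferred


data = {
    ("foaf:depicts", LABEL_PREDICATE, "LAB:depicts"),
    ("geo:approxLocation", LABEL_PREDICATE, "LAB:approxLocation"),
    ("soc:SoccerWaylandPhoto", "LAB:depicts", "soc:ourTeam"),
    ("soc:SoccerWaylandPhoto", "LAB:approxLocation", "_:field"),
    ("photo:img001", RDF_TYPE, "soc:SoccerWaylandPhoto"),
}
result = apply_labels(data)
for triple in sorted(result):
    print(triple)
```

    <p>
      Note that nothing is ever asserted about the label itself using
      the ordinary properties, which is exactly the trap described
      under "How not to".
    </p>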
    <h3>
      Sharing tags and labels
    </h3>
    <p>
      The architecture of this system then is that tags are
      initially local to the user. Anyone can use any word to
      tag anything they want. Labels are used to associate meaning
      with them, but the tag itself is local.
    </p>
    <p>
      Mapped into RDF, tags are classes in a local namespace. They
      can of course be shared. Tagging things with other people's
      tags attributes to them the properties associated with those
      tags, if any. Some people may define tags with rather loosely
      defined meaning, and no RDF labels, in which case others will
      be less inclined to use those tags.
    </p>
    <h3>
      Smart Labels and one-variable rules
    </h3>
    <p>
      When one combines a selection expression of a 'smart folder'
      with a label, then the result is a form of rule which is
      restricted to one variable. This can be expressed in OWL as a
      subclass relationship between restrictions.
    </p>
    <p>
      A lot of information can be expressed as rules, but finding
      an intuitive user interface to allow lay users to express
      their needs with rules has been a stumbling block. These
      smart folder and label metaphors, combined, could be a route
      to solving this problem<a href="#L842">*</a>.
    </p>
    <h2>
      Work in the area
    </h2>
    <p>
      There are many systems which use selection rules to define
      virtual sets of things. There are probably lots which use an
      abstraction equivalent to labels.
    </p>
    <p>
      One system which effectively uses labels is (I think)
      described as 'semantic folders' (@@link Lassila and Deepali),
      to be published
    </p>
    <p>
      There is a language for labels being defined, as it happens,
      by the Web Content Labeling (WCL) Incubator Group at W3C. The
      final form of expression has not been decided.
    </p>
    <h2>
      Conclusion
    </h2>
    <p>
      The concept of a label as a preset set of data which is
      applied to things and classes of things provides an intuitive
      user interface for an operation which should be simple for
      untrained users.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 01 Aug 2009 00:00:00 GMT</pubDate>
  <title>A Short History of the term &quot;Resource&quot;</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/TermResource.html</link>
    <guid>https://www.w3.org/DesignIssues/TermResource.html</guid>
      <description><![CDATA[
    <h1>
      A Short History of "Resource" in web architecture.
    </h1>
    <p class="asbstract"></p>
    <p>
      There has been a lot of confusion from widely varying uses
      of this term for various different historical reasons,
      leading to uses which are sometimes ambiguous and in places
      inconsistent. This article attempts to shed light on the
      issue.
    </p>
    <p>
      Historically, URIs were used to point to things like web
      pages and files and movies, on the web, useful documents, or
      "online resources" in the sense of useful things out there.
      FTP, Gopher and HTTP sites served up various types of online
      resources. People got used to http://example.com/ being a web
      page and http://example.com/#contact being an anchor within
      it.
    </p>
    <p>
      The Online Information community, into whose domain the web
      stuff was put for standardization at the IETF, referred to
      these things like web pages as resources, and changed the
      original "D" for "Document" in "UDI" to "R". Some felt that
      resource was a more appropriate term, maybe because "document"
      wasn't wide enough to include things like movies.
    </p>
    <p>
      Now the URI spec actually allowed URIs for completely
      different things, such as telephone end points, and wisely
      the URI spec does not make any arbitrary constraint on what a
      resource should be, especially a resource denoted by a URI in
      a new scheme to be invented.
    </p>
    <p>
      Meanwhile, the HTTP spec was polished and elaborated
      basically as a document delivery system, plus other methods
      for updating documents, plus POST. (POST started historically
      as a way of introducing a new web page by posting it to a
      list, just as in NNTP. It then almost immediately got used as
      a catch-all extension method. I will ignore it in this
      overview).
    </p>
    <p>
      There was no real definition of what a resource or document
      was -- maybe because it seemed obvious. The HTTP spec did not
      even specify whether the URI denoted a person or a document
      about them, it just explained that the thing returned was a
      representation of the resource.
    </p>
    <p>
      Roy's REST work then came along to formalize HTTP as REST and
      declared that a resource was a time-varying mapping between
      URI and representation. That was good enough for HTTP. It
      wasn't enough for the AWWW, when it came along, to be
      able to describe how the web worked.
    </p>
    <p>
      In fact, the AWWW document, to explain how to use the web
      properly, had to add in a bunch of stuff about the social
      expectations -- things like, yes, the mapping from URI to
      representation is a function of time, but not just any old
      one -- a random function is not typically very useful. There
      are expectations about how it can change with time: persistence
      and consistency, with various common patterns which allow the web
      to be a useful medium. The AWWW decided to use the term
      "Information Resource" for a thing like a web page which
      contains information, and "Resource" for any old thing at
      all.
    </p>
    <p>
      So the HTTP and REST work was done very much in this space
      of document delivery, editing and update. There was no
      philosophical need to talk about what the URI denoted (the
      person, the web page about the person) until RDF came along,
      when there was an immediate need.
    </p>
    <p>
      When RDF was first developed, it was motivated by the need
      for data about resources very much in the online information
      sense: data about documents, or 'metadata'. In fact it was
      designed to be able to describe anything, but many early
      users of RDF referred to it as metadata technology. RDF used
      the word "resource" rather awkwardly in fact as it turned
      out. In the beginning, many of the things being described
      were documents, and so the online information meaning of
      resource made sense. But in fact in RDF the resource was
      allowed to be anything at all. A class, rdfs:Resource, even
      used the term as the universal class of all things. A little
      later, the Web Ontology Language decided to use Thing for
      that.
    </p>
    <p>
      RDF came along in what I think was a neat way. It used
      completely existing web protocol extension devices to
      introduce a new system which was fundamentally different from
      the old HTTP+HTML one. The HTML web was a hypertext model,
      with pages and anchors. The RDF model was a knowledge
      representation one of arbitrary things. It did this by using
      the fact that a new language can define whatever it likes as
      what a local identifier denotes. A graphic language might use
      local identifiers to denote lines and points. HTML used local
      identifiers to identify hypertext anchors. RDF used them to
      identify arbitrary concepts, people, whatever.
    </p>
    <p>
      The web architecture gave all these languages a common way of
      building a global identifier for the thing denoted by a local
      identifier in a given document. The semantics of the hash
      sign are defined web-wide to mean that "a#b" can be used to
      denote whatever is denoted by "b" in the document denoted by
      "a".
    </p>
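    <p>
      The "a#b" convention is easy to see in code. The following
      Python sketch uses the standard library's
      <code>urllib.parse.urldefrag</code> to split a global identifier
      into the document part and the local identifier; the two helper
      functions are hypothetical names introduced for this example.
    </p>

```python
from urllib.parse import urldefrag


def split_global_id(uri):
    """Split "a#b" into the document part "a" and the local identifier "b"."""
    document, local_id = urldefrag(uri)
    return document, local_id


def make_global_id(document, local_id):
    """Build the global identifier for local_id as defined in document."""
    return document + "#" + local_id


doc, local = split_global_id("http://example.com/foo.rdf#color")
print(doc)    # the document one would fetch
print(local)  # the identifier whose meaning that document defines
```

    <p>
      Only the document part is sent to the server when the URI is
      looked up; what the local identifier denotes is decided by the
      language of the document that comes back.
    </p>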
    <p>
      Worked a treat. At the beginning of the century, people
      played around and gave all kinds of things URIs like
      "http://example.com/foo.rdf#color". Some of us did lots of
      work and made all kinds of systems which exchanged and
      integrated data in this way.
    </p>
    <p>
      Two snags occurred, as the years passed. One was that a bunch
      of RDF users got the fact that it was good to use HTTP URIs,
      but didn't get the fact that you should put the foo.rdf
      online so that people can look up what #color means in it.
      And as they didn't do that, they didn't actually bother with
      the "#" at all. The second fly in the ointment was that some
      people wanting to use RDF for large systems found that they
      didn't want to use the "#". This was sometimes because the
      number of things defined in the same file was too low (like
      1) or too large (like a million) and it was difficult to
      divide up the information into middle-sized chunks. Or they
      just didn't like the "#" because it looks weird. But for one
      reason or another people demanded the right to be able to use
      http://example.net/people/Pat to denote Pat rather than a web
      page about Pat.
    </p>
    <p>
      This potentially led to huge failures in the whole RDF world,
      with systems already built which just used
      "http://example.net/people/Pat" to identify the document
      whether you like it or not. I among others pushed back
      against using non-hash URIs for arbitrary things, but
      eventually gave in.
    </p>
    <p>
      So in response to this, the HTTP protocol was, in fact,
      changed.
    </p>
    <p>
      The spec wasn't changed. The spec editors were not brought on
      board to the new model. The spec was interpreted. The TAG
      negotiated in a way a truce between the existing HTTP spec,
      RDF systems, and people who wanted to use HTTP URIs without
      "#" to identify people. That truce was HTTPRange-14, which
      said that you don't <i>a priori</i> know that a hashless HTTP URI
      denoted a document, but if the server responded with a 200
      then you did, and you had a representation of the document.
      If you did a GET on one of these new URIs which identified
      things which were not documents (people, RDF properties, classes,
      etc.) then the server must not return 200; it can return 303
      pointing to a document which explains more.
    </p>
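    <p>
      From a client's point of view, the truce described above amounts
      to a small decision table. The Python sketch below is one
      possible reading of it, not a normative statement of the TAG
      finding: the function name, the return shape, and the use of the
      words "Document" and "Thing" as labels are assumptions made for
      illustration.
    </p>

```python
def interpret_response(status, location=None):
    """Say what a hashless HTTP URI may denote, given the response to a GET.

    Returns a pair (kind, see_also):
      kind     -- "Document", "Thing", or None when nothing can be concluded
      see_also -- URI of a document which explains more, if the server gave one
    """
    if status == 200:
        # A representation came back, so the URI denotes a document.
        return ("Document", None)
    if status == 303:
        # The URI may denote anything at all (a person, a property, a class);
        # the Location header points to a document about it.
        return ("Thing", location)
    # Any other status tells us nothing about what the URI denotes.
    return (None, None)


print(interpret_response(200))
print(interpret_response(303, "http://example.net/people/Pat.rdf"))
print(interpret_response(404))
```

    <p>
      The URI "http://example.net/people/Pat.rdf" is a made-up
      redirect target; a real server would choose its own.
    </p>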
    <p>
      So the HTTP protocol was, effectively, changed. The HTTP
      protocol as extended now allows HTTP to be used not only for
      Documents but for arbitrary Things. It extends the set of
      things which you can ask a web server about from documents to
      anything. It isn't a very bad design, nor very beautiful.
      Other designs would have worked, but that one was the only
      one which didn't have major problems for some community. It
      could be extended, but basically it works. It would be very
      expensive to reverse it in terms of systems which have been
      deployed.
    </p>
    <p>
      It is also very expensive to go on debating it as though it
      is an open issue. It is reasonable to try to make the
      documents more consistent.
    </p>
    <p>
      Anyway, that is a simplified version of the history of all
      this as I saw it.
    </p>
    <p>
      I would like to see what the documents all look like if
      edited to use the words Document and Thing, and eliminate
      Resource. That's my best bet as to two English words which
      come as close as we can get to what we want. Note however
      that the web is a new system, a design in which new concepts
      are created, so we can't expect English words to exist to
      capture exactly the concepts. So we take those nearby and
      abuse them as little as we can as far as we can tell at the
      time, and then write them in initial caps to recognize that
      that is what we have done.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 27 Aug 2009 00:00:00 GMT</pubDate>
  <title>My Top Ten terms</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/TopTen.html</link>
    <guid>https://www.w3.org/DesignIssues/TopTen.html</guid>
      <description><![CDATA[
    <h1>
      My Top Ten Terms
    </h1>
    <p>
      "A little semantics goes a long way", says Jim Hendler. He
      uses the phrase in lots of ways. I'm going to use it in
      the sense that a few ontological terms are so common
      that they will pull together all kinds of data in all kinds
      of places. ___@@ at Mitre calls this "loose coupling": the
      idea that you would connect things through these few terms,
      and leave other connections to future projects. He talks
      about "Who, where, when?"
    </p>
    <p>
      These are my ten terms for today, selected on the basis that
      they are important common dimensions, and if more data was
      connected using more of them then the data in the world would
      be a whole lot more connected. They are chosen from a
      pragmatic standpoint. There are probably
    </p>
    <h2>
      People
    </h2>
    <p>
      Let's start with "Who?". I've picked two terms from the
      Friend of a Friend ontology because it is an ontology
      designed and used specifically in a bottom-up way to connect
      many people through nets of "knows" links between
      acquaintances.
    </p>
    <h3>
      foaf:mbox
    </h3>
    <p>
      The email address is the practical way most people on the net
      are identified. Web sites use email callback to establish
      that someone really does have the given email address. Yes,
      it isn't perfect, as
    </p>
    <h3>
      foaf:name
    </h3>
    <p>
      "When"? There are large numbers of quite complete ontologies
      of time, which use different models to discuss the
      complexities of the different scales, and of things like
    </p>
    <h3>
      cal:dtstart
    </h3>
    <h3>
      cal:dtend
    </h3>
    <p>
      "Where" is a huge one.
    </p>
    <h3>
      geo:lat
    </h3>
    <h3>
      geo:long
    </h3>
    <p>
      Documents are of course the lifeblood of the Web and the
      Semantic Web. The first call for structured data, which gave
      rise to RDF, was for data about documents. The Dublin Core
      group sat down to pick their top ten properties for
      documents. Of the ones they produced, dc:title is the one I
      have found the most call for. I take the HTML &lt;title&gt;
      tag to have the same meaning.
    </p>
    <h3>
      dc:title
    </h3>
    <h3>
      foaf:maker
    </h3>
    <p>
      In a fractal world, the total cost of ontologies for any
      project is reduced by the fact that ontologies already exist
      in various communities
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 06 Feb 1997 00:00:00 GMT</pubDate>
  <title>User Interface in a consistent world</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/UI.html</link>
    <guid>https://www.w3.org/DesignIssues/UI.html</guid>
      <description><![CDATA[
    <h1>
      Cleaning up the User Interface
    </h1>
    <p>
      Tim BL 6 Feb 97
    </p>
    <p>
      We can talk in the abstract about the sorts of things we'd
      like to do, but it is a wise thought experiment to imagine
      what the user interface to a more powerful Web would be. So
      here is an attempt to do that thought experiment. There are
      no screen mockups -- as I said, you have to use your
      imagination. &nbsp;
    </p>
    <h3>
      <a name="Consistency" id="Consistency">Consistency</a>
    </h3>
    <p>
      First of all, the interface to a universal space should have
      a certain universal consistency. Currently, the user
      interface you see on your PC or Mac has a few unfortunate
      inconsistencies which make it more difficult to use, and less
      powerful. One is that the "desktop" interface to the file
      system is different from the "browser". (See talk at Boston
      Web Conference WWW4, Dec 1995). It is crazy, if you think
      about it, that the whole screen is used to represent the
      information which happens to be on your local file system,
      using the metaphors of folders, while one window is used to
      represent the information in the rest of the world, using the
      metaphor of hypertext. What's the difference between
      hypertext and a desktop anyway? You can double click on
      things you find in either. Why can't I put folders into my
      hypertext documents? Why can't I write on the desk? Folders
      should be just another sort of document. My home page could
      be one, or it could be a hypertext document. The concepts of
      "folder" and "document" could be extended until they were the
      same, but I don't think that that would be necessarily a good
      idea. It's OK to have different forms of object for distinctly
      different uses.
    </p>
    <h3>
      <a name="Location" id="Location">Location fixation</a>
    </h3>
    <p>
      The only fundamental difference between a hypertext document
      and a folder is that there is a special relationship between
      any document and the folder where it "is". Unix allows a file
      to be in more than one directory (with hard links) provided
      they are on the same disk. DOS requires a file to be in just
      one directory, which again must be on the same disk. Users
      should not have to worry about what disks files are on.
      Suppose the system just files everything under its creator
      and its creation date, and the rest of the system is just
      pointers (soft links, hypertext links, shortcuts). Then a
      folder becomes a useful sort of document for helping us
      organize things, but not a container with implications on
      physical storage. The relationship between a folder and a
      file in it becomes just a hypertext link. Ah, consistency!
    </p>
    <h3>
      <a name="Protocol" id="Protocol">Protocol - Smotocol</a>
    </h3>
    <p>
      Another inconsistency is the current strange division between
      mail, browser, and news reader tools. Each has an editor. The
      editors are in some cases plain text, and in some cases
      fancier things such as HTML. On the Internet, a mail agent
      allows you to use the Simple Mail Transfer Protocol, a news
      agent allows you to use the Network News Transfer Protocol,
      and a web editor allows you to use the Hypertext Transfer
      Protocol. This is of course totally meaningless to a user.
      From the user's point of view, the mail program allows you to
      move data between mailboxes (folders by any other name) and
      also allows you to link it into someone else's "In" box. I
      say "link" as it creates the relationship "this message (file
      by any other name) is in this mailbox (folder)", which we
      said above should be a link.
    </p>
    <p>
      A news editor allows you to link a news document to a widely
      visible group (box, folder by any other name) which is
      visible to people all over the world. Functionally, it is
      very like mailing something to a list of people, as it
      creates links to the document from groups in each of their
      news readers.
    </p>
    <p>
      A web editor allows you to upload a document into a web
      server, though the way in which you do that varies. (The
      original meaning of the HTTP "POST" operation was to have been
      a very equivalent "make new document and make link from this"
      operation.)
    </p>
    <p>
      Now suppose you want to create a bit of information and you
      want to link it from a few individuals' "in" boxes, a few
      news groups and a few hypertext documents. You also want it
      to show up in various folders. Which application should you
      use?
    </p>
    <p>
      Clearly, this is a choice which the user should not have to
      make. Conceptually, a number of links are being made. In
      practice, various protocols will be used by the system. In
      the future, combined protocols may exist which efficiently
      perform all these functions as appropriate. Let's not bother
      the user with this. When I create an object, I want to pick
      the type of object I am creating. I also think it is
      reasonable, if anyone else is going to have access to the
      object, for me to specify the access list and the
      distribution terms: I am controlling my new intellectual
      property. It is also useful for me to specify what sort of
      quality of storage I want for the document. Do I want it
      archived reliably for posterity? Do I want it instantly
      available very rapidly and reliably? The answers to these
      questions will determine where the system will store my data.
      The last thing I want to be asked is the filename.
    </p>
    <p>
      The "save as" filename dialog box is one of the things
      currently holding up our civilization. It doesn't ask the
      user for the right information. It asks not when you are
      creating the document (and thinking about it at that high
      level) but when you have finished and are about to do (and
      thinking about) something else. [See <a href="http://www.cooper.com/articles/vbpj_secondary_storage_dilemma.html">
      Cooper on this</a>]. As you move information from your head
      into a computer, everything can be intuitive until this step
      asks you to think of the disks and the operating system.
    </p>
    <table border="1" cellpadding="2">
      <tbody>
        <tr>
          <td>
            Creators of documents should be able to specify
            <ul>
              <li>Access lists and distribution terms
              </li>
              <li>Quality of storage
              </li>
            </ul>
          </td>
        </tr>
      </tbody>
    </table>
    <p>
      Of course, I won't want to be negotiating each of these
      parameters every time I create a new object, so probably I
      will have a set of templates, standard genres of document,
      for which these things have been set. In the case of HTML
      files, I will probably want to associate a default content
      and a default style sheet with each template. This will make
      it easier for me and for my readers to get an intuitive feel
      for the access and archival status of documents at a glance.
    </p>
    <p>
      I will get back a URI from this operation. The quality of
      storage is part of the agreement between me, as a creator of
      an object, and the service I use to create and support that
      URI.
    </p>
    <p>
      Sure, people like to pick URIs so that they can be mnemonic.
      I don't mind that. The problem is when the URIs are picked so
      that it is difficult to support them in the future. (See the
      section on naming, and HTTP PUT).
    </p>
    <p>
      Let's assume that this inconsistency will be dealt with in
      future.
    </p>
    <h3>
      <a name="Signing" id="Signing">Signing: From Documents to
      Deeds</a>
    </h3>
    <p>
      So the consistent user interface we have so far is one in
      which we are at home with documents and links, and we
      communicate by massaging the documents and the links.
      &nbsp;The whole thing is "quasi-static" -- at any one time
      you can believe you understand a part of it. &nbsp;It has
      state. When you change its state you can see what you are
      doing.
    </p>
    <p>
      That's not enough. &nbsp;It's not enough because in this
      model everything is malleable: every document is a "living"
      document. Everything can change. &nbsp;This is great for an
      encyclopaedia but it's no good for a check book. &nbsp;It
      contains description, but because action is represented only
      in a constrained form, limited to those actions which can be
      viewed as changes of state, there are things we just can't
      do. We can do idempotent actions -- those which are just as
      good if you do them twice as if you do them once. &nbsp;This
      doesn't work for paying bills.
    </p>
    <p>
      We can introduce actions which count (or if you like,
      "actions which you can count") in two ways which boil down to
      the same thing. &nbsp;One way is for us to enrich the concept
      of operations on the net from "GET", "PUT" and "LINK" to a
      whole plethora of different functions. &nbsp;We would
      probably divide resources by their object "class", which
      would define what set of operations are available for each
      resource. This is the familiar world of distributed object
      oriented programming.
    </p>
    <p>
      The second way is that we would allow, in the user interface,
      documents to be signed. &nbsp;A signature on paper is a
      special thing (in principle). &nbsp;It is a countable
      operation. &nbsp;You make a signature on the document and it
      becomes something different. &nbsp;No longer duplicatable at
      will, your act of signing is caught -- and you can be held to
      it. &nbsp;The document becomes something which has been done:
      a <b>deed</b>. This is the familiar world of legal process.
      &nbsp;On the web, it happens when you press a "submit" button
      and your order is submitted to the mail order company. When
      you make a document into a deed, it freezes. &nbsp;You can't
      change deeds. &nbsp;You can revoke them or cancel them out
      with other deeds, but you can't change one. Deeds are not
      living documents. &nbsp;In fact, lots of documents are in
      practice frozen in that they aren't going to change much.
      &nbsp;But they may not be deeds: no one has explicitly made an
      action on them. &nbsp;
    </p>
    <p>
      The two forms are equivalent, as when you make a deed, you
      can in the document write the name and parameters of any
      function you want executed, and submit it to a remote
      operation agent. Similarly, whenever a remote non-idempotent
      &nbsp;operation is made using some Remote Procedure Call
      protocol, in practice the protocol involves making some
      message up containing the parameters, and directly or
      indirectly putting on it some sequence number or identifier
      to prevent it from being accidentally operated on twice.
      &nbsp;Both are abstract representations of a commitment, an
      action which counts.
    </p>
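    <p>
      A minimal sketch of that equivalence, in Python: a deed is
      modelled as a frozen message carrying a function name, its
      parameters and a unique identifier, and the agent uses the
      identifier to avoid acting on the same deed twice. The names
      <code>OperationAgent</code> and <code>make_deed</code> and the
      "pay" function are hypothetical, chosen only for
      illustration.
    </p>

```python
import uuid


class OperationAgent:
    """Hypothetical remote agent that acts on each deed at most once."""

    def __init__(self):
        self.balance = 0
        self._seen = {}  # deed identifier -> result already returned

    def submit(self, deed):
        deed_id = deed["id"]
        if deed_id in self._seen:
            # The same deed arriving again (say, a retransmission):
            # return the earlier result instead of acting a second time.
            return self._seen[deed_id]
        if deed["function"] == "pay":
            self.balance += deed["params"]["amount"]
        else:
            raise ValueError("unknown function: " + deed["function"])
        self._seen[deed_id] = self.balance
        return self.balance


def make_deed(function, **params):
    """Write a function name and parameters into a message with an identifier."""
    return {"id": str(uuid.uuid4()), "function": function, "params": params}


agent = OperationAgent()
deed = make_deed("pay", amount=30)
agent.submit(deed)
agent.submit(deed)                         # accidental resend: not counted again
agent.submit(make_deed("pay", amount=30))  # a second, distinct deed: counted
```

    <p>
      After these three submissions the agent has counted two
      payments, not three: the identifier is what turns a repeatable
      message into an action which counts.
    </p>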
    <p>
      As we're talking about user interface here, I'd like to see a
      clean interface for making a deed, which makes it quite clear
      to me that I am committing something, and not just doing
      another search. &nbsp;I suppose I have a set of "rubber
      stamp" icons which leave a name/date stamp on a document.
      &nbsp;Different stamps can be made with different levels of
      security. &nbsp;They may represent actions by different
      people, different roles, with different levels of authority.
      &nbsp;I guess I'd have one for stamping W3C Recommendations
      which have been through the process, and a totally different
      one for ordering medium sized purchases on my credit card.
    </p>
    <p>
      Deeds don't have to be signed digitally (but it helps).
      &nbsp;Every time you press "send" in your email you are
      making a deed. The document freezes. &nbsp;You may even be
      digitally signing it. You lose control -- you can't take it
      back. &nbsp;
    </p>
    <p>
      Socially, we will have to accept electronic deeds. Also, we
      will have to define the limits of commitment which someone
      can imply by changing living documents without explicitly
      making a deed.
    </p>
 
    <h3>
      <a name="OhYeah" id="OhYeah">The "Oh yeah?" button</a>
    </h3>
    <p>
      See also WWW4 Boston talk.
    </p>
    <p>
      Deeds are ways we tell the computer, the system, other
      people, &nbsp;the Web, to trust something. How does the Web
      tell us?
    </p>
    <p>
      It can happen in lots of ways but again it needs a clear user
      interface. &nbsp;It's no good for one's computer to be aware
      of the lack of security about a document if the user can
      ignore it. But then, most of the time as user I want to
      concentrate on the content not on the metadata: so I don't
      want the security to be too intrusive. The machine can check
      back the reasons why it might trust a document automatically
      or when asked. Here is just one way I could accept it.
    </p>
    <p>
      At the toolbar (menu, whatever) associated with a document
      there is a button marked "Oh, yeah?". &nbsp;You press it when
      you lose that feeling of trust. &nbsp;It says to the Web,
      "so how do I know I can trust this information?". The
      software then goes directly or indirectly back to
      metainformation about the document, which suggests a number
      of reasons. These are like incomplete logical proofs. One
      might say,
    </p>
    <blockquote>
      "This offer for sale is signed with a key mentioned in a list
      of keys (<i>linked</i>) which asserts that the Internet
      Consumers Association endorses it as reputable for consumer
      trade in 1997 for transactions up to $5000. The list is
      signed with key (<i>value</i>) which you may trust as an
      authority for such statements."
    </blockquote>
    <p>
      Your computer fetches the list and verifies the signature
      because it has found in a personal statement that you trust
      the given key as being valid for such statements. That is,
      you have said, or whoever you trusted to set up your profile
      said,
    </p>
    <blockquote>
      "Key (value) is good for verification of any statement of the
      form `the Internet Consumers Association endorses page(p) as
      reputable for consumer trade in 1997 for transactions up to
      $5000.'"
    </blockquote>
    <p>
      and you have also said that
    </p>
    <blockquote>
      "I trust for purchases up to $3000 any page(p) for which `the
      Internet Consumers Association endorses page(p) as reputable
      for consumer trade in 1997 for transactions up to
      $5000'."
    </blockquote>
    <p>
      The result of pressing on the "Oh, yeah?" button is either a
      list of assumptions on which the trust is based, or of course
      an error message indicating either that a signature has
      failed, or that the system couldn't find a path of trust from
      you to the page.
    </p>
    <p>
      Notice that to do this, we do not need a system which can
      derive a proof or disproof of any arbitrary logical
      assertion. The client will be helped by the server, in that
      the server will have an incentive to send a suggested proof
      or set of possible proof paths. &nbsp;Therefore it won't be
      necessary for the client to search all over the web for
      the&nbsp;path.
    </p>
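    <p>
      As a rough sketch of what the button could do with the
      proof paths a server suggests, the Python below checks each
      suggested endorsement against the user's own statements and
      returns the assumptions used. Everything named here is an
      assumption for illustration: the <code>oh_yeah</code> function,
      the dictionary fields, the key name "K1" and the shop URI.
      Signature checking is reduced to a precomputed flag rather
      than real cryptography.
    </p>

```python
def oh_yeah(page, amount, endorsements, trusted_keys, my_limit):
    """Return the assumptions on which trust in `page` rests.

    `endorsements` are the proof paths suggested by the server.
    Raises ValueError if a signature has failed, and LookupError if
    no path of trust from the user to the page is found.
    """
    for endorsement in endorsements:
        if endorsement["page"] != page:
            continue
        if not endorsement["signature_ok"]:
            raise ValueError("signature failed on list from "
                             + endorsement["signer"])
        if (endorsement["key"] in trusted_keys
                and endorsement["limit"] >= amount
                and my_limit >= amount):
            return [
                "you trust key %s for such statements" % endorsement["key"],
                "%s endorses %s for transactions up to $%d"
                % (endorsement["signer"], page, endorsement["limit"]),
                "you trust endorsed pages for purchases up to $%d" % my_limit,
            ]
    raise LookupError("no path of trust from you to " + page)


suggested = [{
    "page": "http://shop.example/offer",
    "signer": "Internet Consumers Association",
    "key": "K1",
    "limit": 5000,
    "signature_ok": True,
}]
reasons = oh_yeah("http://shop.example/offer", 2500, suggested, {"K1"}, 3000)
```

    <p>
      With the user's own limit set at $3000, a $2500 purchase
      yields three assumptions, while a $4000 purchase finds no
      path of trust even though the endorsement itself runs to
      $5000.
    </p>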
    <p>
      The "Oh, yeah?" button is in fact the relatively easy bit of
      human interface. Allowing the user to make statements like those
      above and understand them is much more difficult. About as
      difficult as programming a VCR clock: too difficult. So
      I&nbsp;imagine that the subset of the logic language which is
      offered to most users will be simple: certainly not Turing
      complete!
    </p>
    
    <h2>
      <a name="Specific" id="Specific">Specific notes</a> on
      Windows UI and Typical Web browsers
    </h2>
    <p>
      (1997)
    </p>
    <p>
      The gist of it is the need for greater consistency in the UI
      and the underlying system.
    </p>
    <h3>
      Consistency
    </h3>
    <p>
      Some basic principles:
    </p>
    <p>
      1. Anything of any value and persistence must have a URI so
      that it can be referenced (yes, I know Microsoft have a
      Moniker scheme but now it has to be URIs to go global).
    </p>
    <p>
      2. Any place I can use a URI I can use any URI.
    </p>
    <p>
      3. Links are evident as a primary user interface metaphor,
      with a consistent drag/drop or control/drag/drop for link
      creation, and consistent ways of viewing by link type.
    </p>
    <p>
      4. The system should generate persistent URIs wherever
      possible. These can be just URLs in file or http space but
      they should not change. This is a longer term thing.
    </p>
    <p>
      5. Things like start menus, bookmarks, favorites, recent
      document lists, toolbars, should be viewable as discrete
      objects in familiar ways. A Good Thing is the ability to
      easily see and manipulate the start menu in Explorer (right
      click Start). A Bad Thing is the modeful "organize favorites"
      dialog box, the most inconsistent, outdated, constraining way
      of moving things around I have seen in a while. &nbsp;A
      modeless "Goto bookmarks" is better, but
    </p>
    <p>
      I propose that any set or hierarchical structure should have
      a consistent windows explorer interface. You clearly feel
      that anything which is a set can also have an HTML view. In
      that case, I want HTML views of everything. Look at all the
      containers we have:
    </p>
    <ul>
      <li>Desktop
      </li>
      <li>Folder
      </li>
      <li>Start menu
      </li>
      <li>Toolbars
      </li>
      <li>Web pages
      </li>
      <li>Mailboxes
      </li>
      <li>Bookmarks
      </li>
      <li>Favorites
      </li>
    </ul>
    <p>
      These must have consistent interfaces.
    </p>
    <p>
      Some of these we can do without. Favorites, most of the Start
      Menu, and mailboxes are all ways of classifying things. They
      should all basically have the same interface and frankly I
      don't see the need for them being different. I want to put my
      inbox in my start menu. I want to be able to choose whether
      something I put in a menu is terminal or expanded, in
      general. (I could always put a link to the Favorites folder
      into the start menu but it wouldn't be expanded as a menu.)
      I'd like general control over menus.
    </p>
    <p>
      (Examples of things I want to do: Bookmark mail messages of
      interest, indeed any object. &nbsp;Save a web page in a
      mailbox (which would freeze it in the cache and keep a
      pointer to it)
    </p>
    <p>
      ...
    </p>
    <p>
      )
    </p>
    <p>
      Dialog boxes are, as we really know, bad UI. The "Save As" and
      "Open" dialog boxes are a pain (I always want to be able to
      delete and rename files in them) because they are constraining
      and inconsistent, and they block until you get out. I'd be
      happy for an open box to simply launch an explorer window or
      select a current one, while providing a receptacle for icons
      which are to be opened. I'd like that receptacle (which is
      just the application icon) to be a URI I can save in any of
      the containers above.
    </p>
    <p>
      The history list should of course be something which works
      across all applications. A time-based view is a neat history
      list, but it should apply across everything I have done in any
      application. Make it generic so it works on any object. Make
      the index of everything I have done (maybe my whole click
      stream) be available as such an object, to be viewed just
      like a mailbox. I should even be able to make a link into it.
    </p>
    <h3>
      Mailto: addresses
    </h3>
    <p>
      A mailto address is a misnomer (my fault I feel as I didn't
      think when we created it) as it is not supposed to be a verb
      "start a mail message to this person", it is supposed to be a
      reference to a web object, a mailbox. So clicking on a link
      to it should bring up a representation of the mailbox. This
      for example might include (subject to my preferences) an
      address book entry and a list of the messages sent to/from
      [Cc] the person recently to my knowledge. Then I could mail
      something to someone by linking (dragging) the mailbox icon
      to or from the document icon.
    </p>
    <h3>
      Modularity: MIME types and Operations
    </h3>
    <p>
      The OS as a concept &nbsp;is currently having trouble with
      modularity -- the module used to be an application which
      provided its own communication and document types. Now the
      MIME types are orthogonal. I see new applications as either
      producing new MIME type support or new link functionality (or
      both) but separably.
    </p>
    <p>
      I should be able to mail anything. Suppose I am editing a
      photo. The tool bars are photo editing toolbars. When I want
      to mail it, I find the person I want to mail it to (any way:
      not necessarily from an address book -- I can put my friends in
      my start menu or anywhere or of course find them on the web).
      I drag the document onto the person icon. Now that
      relationship gives me a choice (To, Cc, etc), and I will now
      get an extra toolbar for controlling the mail options. If I
      drag it to a newsgroup, the functionality will be exactly the
      same though the options may be different. I can then view the
      links from the message by type (To, Cc, newsgroup, embedded
      hypertext link, version info, etc.) in a consistent way. I
      regard dragging the icon of the document to a folder as
      making a link too. The system should stop regarding folders
      as where things are stored. That has got to be decoupled, to
      separate the logical thing (I classify this as important
      travel) from the physical thing (I put this on drive C:).
      That is going to be a steady change, but early simple steps
      are
    </p>
    <ul>
      <li>allow the user to specify the algorithm for defining the
      filenames for new objects of each type, when defining
      templates, with a macro &nbsp;like<br>
        http://my.co.com/BY/&lt;user&gt;/&lt;yymm&gt;/NOTE/&lt;seq&gt;.txt
      </li>
      <li>give a good default set of macros which will result in
      filenames which don't have to change later
      </li>
      <li>ask when defining the template or creating a file without
      a template for a storage quality which will determine in the
      short term the filename &nbsp;and in the long term a way in
      which the system will duplicate, backup and distribute the
      document
      </li>
      <li>ask at the same time for a default Access Control List for
      the template.
      </li>
    </ul>
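    <p>
      The first of these steps might look something like the
      following Python sketch. It assumes the macro is rewritten
      with Python's brace placeholders (<code>{user}</code>,
      <code>{yymm}</code>, <code>{seq}</code>) in place of the
      angle-bracket form above; the <code>make_namer</code> helper,
      the user name and the per-template counter are hypothetical.
    </p>

```python
import itertools
from datetime import date


def make_namer(template, user, today=date.today):
    """Return a function that produces a fresh URI on every call."""
    counter = itertools.count(1)  # sequence number within this template

    def next_uri():
        return template.format(
            user=user,
            yymm=today().strftime("%y%m"),  # two-digit year, then month
            seq=next(counter),
        )

    return next_uri


# The date is pinned here only so that the example is repeatable.
namer = make_namer(
    "http://my.co.com/BY/{user}/{yymm}/NOTE/{seq}.txt",
    user="timbl",
    today=lambda: date(1997, 2, 6),
)
first = namer()
second = namer()
```

    <p>
      Because the name depends only on creator, creation date and
      a sequence number, nothing in it needs to change if the
      document is later reclassified or moved to other storage.
    </p>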
    <p>
      When I hit "Send" or "Commit", then the document is frozen,
      and the &nbsp;mail engines and NNTP software [and version
      control software] &nbsp;start to actually implement the links,
      and the web server allows appropriate access. All this is
      under the hood, which you can lift to see how it's getting on
      if you like.
    </p>
    <h3>
      Archival and Access levels
    </h3>
    <p>
      This storage quality is a common parameter which I want to be
      able to change for anything. For example, I may decide that I
      want to have a web page or a newsgroup article on "personal
      archive" availability. So I just change it with a menu item.
      No "Save As". No filenames. It has a URI already. And then I
      &nbsp;know it will always be available to me on my desktop
      and laptop. Just like that.
    </p>
    <p>
      The two toolbars of persistence level (think of name for
      that) and &nbsp;access level are so important they deserve
      space maybe in that space to the right of the menu bar, or in
      the window title banner. At least in icon form. I don't expect to
      have many combinations of them once I have customized the
      combinations I need with the archive and access wizard.
    </p>
    <p>
      The archive status includes important flags for disconnected
      working -- is this object to be replicated and if so how
      often, etc.
    </p>
    <h3>
      Disconnected operation
    </h3>
    <p>
      The system has got to know what its connectivity is at any
      time. Without a reboot! I would like this to be a switch
      available across the board, switching IPs and disconnected
      operation.
    </p>
    <div class="nav">
    <hr>
    <p>
      <a href="Overview.html">Up to Design Issues</a>; On to
      <a href="Editor.html">Editing interfaces</a>
    </p>
    </div>
  

]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 01 Dec 1999 00:00:00 GMT</pubDate>
  <title>User agent watch points</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/UserAgent.html</link>
    <guid>https://www.w3.org/DesignIssues/UserAgent.html</guid>
      <description><![CDATA[
    <h1>
      User Agent Watch Points
    </h1>
    <p>
      Interpreting HTTP
    </p>
    <p>
      Browsers and Email programs are <b>user agents</b>. This isn't just
      a formal long term for them, it is an important issue. They
      are programs which act on behalf of, and represent, the user.
    </p>
    <p>
      The computer protocols such as HTTP are defined to carry a
      particular meaning, and it behooves a user agent to
      represent that meaning to the user, or the whole system of
      people and machines breaks.
    </p>
    <p>
      Here are a few ways in which browser designs should, and often
      have not, lived up to this.
    </p>
    <h3>
      Distinguish between HTTP "301 Moved permanently" and "302
      Found"
    </h3>
    <p>
      There are two forms of redirection in HTTP. Each gives a new
      place to look for a resource, but for completely different
      reasons.
    </p>
    <p>
      "<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.2">301
      Moved</a>" is a response which indicates that the server has
      committed the unthinkable and is for some reason not in a
      position to serve the document at that URI. It indicates that
      all references to the original should be changed to the new
      one, including bookmarks and document links. This is an
      expensive solution to a serious problem. It does not, of
      course, work completely, but it is the HTTP way for a server
      to alert a client of this situation.
    </p>
    <p>
      "<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3">302
      Found</a>" is a response which indicates that the server is
      working as a name server. It is a success result indicating
      that you asked for a good document, and that the actual
      contents can currently be found at the given URI.
    </p>
    <p>
      The important use of name server functionality is when for
      some reason it is impractical to serve the document directly
      from a server which can hold the persistent URI. For
      example, a university might issue definitive URIs for its
      successful theses, and might have a very reliable, stable,
      low bandwidth machine which handles that URI space. However
      the content of the theses might in practice be delivered by
      fast machines located in the departments. The university may
      make a persistence commitment for the original URI but not
      for the department's server. Similarly, a user may for load
      reasons or speed reasons be directed to a mirror.
    </p>
    <p>
      It is important that when a user agent follows a "Found" link,
      the user does not refer to the second (less persistent)
      URI. Whether copying down the URI from a window at the top of
      a document, or making a link to the document, or bookmarking
      it, the reference should (except in very special cases) be to
      the original URI.
    </p>
    <p>
      Very few browsers (Mozilla? Amaya?) implement this
      properly as of 1999.
    </p>
    <p>
      There is also <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.8">
      307 temporary redirect</a>, which is similar to a 302 Found.
    </p>
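    <p>
      The distinction above can be sketched as a small rule for
      choosing which URI a bookmark or copied link should hold. This
      is only an illustration, not how any particular browser works:
      the function name <code>uri_to_remember</code> and the shape of
      the <code>hops</code> list are assumptions made here.
    </p>
```python
# Hypothetical sketch: which URI should a user agent remember after redirects?
PERMANENT = {301}          # "Moved Permanently": references should be updated
TEMPORARY = {302, 307}     # "Found" / "Temporary Redirect": keep the original


def uri_to_remember(original_uri, hops):
    """Return the URI a bookmark or copied link should use.

    `hops` is a list of (status_code, location) pairs in the order the
    redirects were followed; the name and shape are assumptions made here.
    """
    remembered = original_uri
    for status, location in hops:
        if status in PERMANENT:
            # The server says the old URI is gone for good: update references.
            remembered = location
        elif status in TEMPORARY:
            # The server is acting as a name server: the content is merely
            # found elsewhere for now, so stop updating the reference.
            break
        else:
            raise ValueError(f"not a redirect handled here: {status}")
    return remembered
```
    <p>
      With this rule, a thesis URI that answers "302 Found" and points
      at a departmental machine is still bookmarked under the
      university's persistent URI.
    </p>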
    <p>
      See also:
    </p>
    <ul>
      <li>
        <a href="http://www.w3.org/Provider/Style/URI.html">Cool
        URIs don't change</a> (In the Style Guide for Online
        Hypertext)
      </li>
    </ul>
    <h3>
      Distinguish between an HTTP <em>POST</em> and a <em>GET</em>
    </h3>
    <p>
      GET operations (as happen when you follow a regular hypertext
      link) are fundamentally different from POST operations (as
      happen when you submit a form to order a book). The first is
      reversible, has no long term effect, and cannot commit a user to
      anything. The latter does commit the user.
    </p>
    <p>
      A graphic client, for example, should use a very different
      <strong>cursor</strong> while the user is hovering over a
      POST button to when the user is hovering over a GET link.
    </p>
    <p>
      Doing a POST is like sending an email. (Currently, in 1999, it may
      be more secure because it will often happen over an https
      secure connection while many email clients do not encrypt
      messages.) It is really important to be able to find a list
      of the emails you have sent: these are the things you are
      committed to. The same applies to HTTP POST forms. The web
      client should <strong>keep a record</strong> of POSTs which
      have been submitted.
    </p>
    <p>
      This would of course waste a lot of space for those web sites
      which get GET and POST muddled, but they are fundamentally
      broken anyway and the sooner we just fix this misuse on all
      sides the better. In the future, digital signature will be an
      action just like POST, but with added weight and with user
      awareness of the choice of key. Understanding when a
      commitment is made is a really important part of the user
      interface. Get it right.
    </p>
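    <p>
      A record of POSTs could be as simple as a "sent items" list.
      The sketch below is one possible shape, assuming the client
      sees the method, URI and body of each request; the class name
      <code>CommitmentLog</code> and its fields are invented for
      illustration.
    </p>
```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CommitmentLog:
    """Hypothetical 'sent items' folder for form submissions."""
    entries: list = field(default_factory=list)

    def note_request(self, method, uri, body=""):
        """Record the request only if it can commit the user (a POST)."""
        if method.upper() == "POST":
            self.entries.append({
                "when": datetime.now(timezone.utc).isoformat(),
                "uri": uri,
                "body": body,
            })
            return True
        # GET has no lasting effect, so there is nothing to keep.
        return False
```
    <p>
      Only the POST is kept, so the list stays short as long as sites
      use GET for requests that commit the user to nothing.
    </p>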
    <p>
      See also:
    </p>
    <ul>
      <li>
        <a href="Axioms.html#get">Axioms: HTTP GET if and only if
        no side effects</a>
      </li>
      <li>
        <a href="/Protocols/rfc2616/rfc2616.html">HTTP
        specification</a>
      </li>
    </ul>
    <h2>
      Hide URIs
    </h2>
    <p>
      Objective: A web user should never be aware of a URI while
      using the Web, either creatively or browsing.
    </p>
    <h4>
      Techniques:
    </h4>
    <ul>
      <li>Hide all access to URIs inside special "under the hood"
      windows or status bars;
      </li>
      <li>Use titles to identify resources;
      </li>
      <li>Introduce RDF properties to indicate short titles and
      icons to make this easier;
      </li>
      <li>Remove all URI-aware functions to a special (optional)
      menu;
      </li>
      <li>Allow files for upload, or other documents to be
      referenced by drag-and-drop, or copy reference/link to
      clipboard.
      </li>
    </ul>
    <p>
      Web <strong>servers</strong> have to help by generating URIs
      for new documents. A new document creation form should
      redirect the user to a document whose URI has been
      generated, and which the user can then edit.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 15 Dec 2016 00:00:00 GMT</pubDate>
  <title>User Interface Tips</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/UserInterface.html</link>
    <guid>https://www.w3.org/DesignIssues/UserInterface.html</guid>
      <description><![CDATA[User interface ideals can be very subjective. Here are my
    own biases. Much of it may be motherhood and apple pie. In a
    way, more than 25 years after the WWW project started, you'd
    think that these things would be generally understood. And
    they are covered in many blogs and books. But here are some
    points which still now seem to need pointing out. They may not
    include your own pet peeves -- but here are some of mine. The
    order is a bit random.
<p><a href="https://www.w3.org/DesignIssues/UserInterface.html">Read rest of article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Sat, 18 Jan 2025 00:00:00 GMT</pubDate>
  <title>A Vision of a New World</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Vision.html</link>
    <guid>https://www.w3.org/DesignIssues/Vision.html</guid>
      <description><![CDATA[
    <p>This article envisions a transformative world enabled by the
    Solid protocol, a decentralized system for personal data
    storage and sharing. Drawing parallels to the early web’s
    potential, it explores the profound societal and technological
    changes Solid could bring by breaking down data silos and
    fostering interoperability across diverse applications and
    domains.</p>
    <p>Solid’s core innovation lies in empowering individuals to
    control their data while enabling seamless connections between
    apps. From healthcare and education to finance, travel, and
    mental health, the integration of personal data promises
    tailored solutions, improved user experiences, and significant
    economic and social benefits. The Solid ecosystem fosters a new
    era of app development, democratizing access through “no-code”
    tools and encouraging the creation of self-governing
    communities.</p>
    <p>By applying <a href="#metcalfes_law">Metcalfe’s law</a> to
    interconnected Solid Apps, the article highlights the extra
    value generated. It emphasizes the liberation from current
    internet constraints, such as siloed platforms and limited data
    portability, enabling more effective collaboration and
    problem-solving at both individual and community levels.</p>
    <p>The article underscores the economic potential of
    Solid-driven innovations, not as an end goal but as a byproduct
    of enhanced productivity and creativity, collaboration and
    compassion. It envisions a future where empowered individuals
    and communities of all sizes work together to tackle global
    challenges, marking a paradigm shift in how we use technology
    to shape the world.</p>
  
<p><a href="https://www.w3.org/DesignIssues/Vision.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Thu, 27 Aug 2009 00:00:00 GMT</pubDate>
  <title>Roadmap for Web Services</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/WebServices.html</link>
    <guid>https://www.w3.org/DesignIssues/WebServices.html</guid>
      <description><![CDATA[
    <h1>
      Web Services
    </h1>
    <blockquote>
      <p>
        Program Integration across Application and Organization
        boundaries
      </p>
    </blockquote>
    <h3>
      Introduction
    </h3>
    <p>
      Web Services mean many things to many people. In the end,
      there will be a set of standards which allow us to do things
      we could not do before, but in the mean time different people
      and companies approach them from different positions, and
      with different expectations. In 2001-2, Web Services have
      also been a buzzword used repeatedly and claimed to be one of
      the hot new technologies. The common themes are:-
    </p>
    <ul>
      <li>A departure from the web as a quasi-static information
      space to one in which interactions are the primary model
      </li>
      <li>A use of HTTP, XML and other standards from the web
      architecture as the building blocks
      </li>
      <li>A typical focus on enterprise wide and inter-enterprise
      operations
      </li>
    </ul>
    <p>
      The Web in Web Services is, from the first point, a misuse:
      the term Internet Services would be more appropriate. The Web
      comes from the second point: HTTP and XML are
      already in use as a well-understood and well-debugged set of
      protocols which support the Web, and so it makes sense to
      reuse them in providing remote operations and those things
      connected with them. The third point is what makes web
      service requirements so different from a local RPC system.
      The fact that data is exchanged for business purposes and
      between different social entities means that accountability
      is required, rather than just reliable transmission.
    </p>
    <ul>
      <li>The vendors of software see web services as a way to
      repackage existing capability in a way which makes it
      interoperable with other systems.
      </li>
      <li>The security requirements for web services are dictated
      by the trust environments, whether it is intranet or b2b or
      b2c, etc
      </li>
      <li>For b2b one needs not just reliability but
      accountability.
      </li>
    </ul>
    <p>
      The architecture of Web Services is the scope of the W3C Web
      Service Architecture Working Group.
    </p>
    <h3>
      Philosophy
    </h3>
    <p>
      Other articles have dealt with the fundamental architectural
      difference between remote operations and the architecture of
      the information space, and the mappings between the two.
    </p>
    <ul>
      <li>
        <a href="Axioms.html">Axioms of web architecture</a>
        (1990s) talks about the information space concept
      </li>
      <li>
        <a href="PaperTrail.html" rel="chapter">Paper Trail</a>-
        Discusses the relationships between two patterns:
        read/write state derived from read-only documents in real
        life. Which came first, the journal or the database?
      </li>
      <li>
        <a href="Conversations.html">Conversations and State</a>
        (1998) discusses the trends in many areas away from shared
        information spaces, from Web Services to Voice browsing.
      </li>
    </ul>
    <p>
      The Web Services architecture group has produced various
      drafts:
    </p>
    <ul class="deliverables">
      <li class="wd">
        <a href="http://www.w3.org/TR/2002/WD-wsa-reqs-20021114" rel="details" class="title">Web Services Architecture
        Requirements</a> (<span class="date">2002-11-14</span>)
      </li>
      <li class="wd">
        <a href="http://www.w3.org/TR/2002/WD-ws-arch-20021114/" rel="details" class="title">Web Services Architecture</a>
        (<span class="date">2002-11-14</span>)
      </li>
      <li class="wd">
        <a href="http://www.w3.org/TR/2002/WD-ws-gloss-20021114/" rel="details" class="title">Web Services Glossary</a>
        (<span class="date">2002-11-14</span>)
      </li>
      <li class="wd">
        <a href="http://www.w3.org/TR/2002/WD-ws-arch-scenarios-20020730/" rel="details" class="title">Web Services Architecture Usage
        Scenarios</a> (<span class="date">2002-07-30</span>)
      </li>
    </ul>
    <p>
      The architecture uses the following diagram for the highest
      level:
    </p>
    <p>
      <img src="diagrams/ws/Triangle.png" alt="Basic Web services architecture graphic">
    </p>
    <p>
      However, the essential part of Web services is the
      <em>Interact</em> relationship between a Service provider and
      Service requestor. This is the Web Service. Discovery
      agencies need not be used - they will in some cases but not
      in others. The discovery agencies are well represented as a
      cloud, rather than being a well-defined module in the web
      services architecture. They will become an interface to a huge
      world of data and query services which provide data about web
      services as well as many other things. The Interact relationship
      between requestor and provider is the essential defining
      element to web services. As we shall see, the metadata about
      web services
    </p>
    <h2>
      Technologies within the Web Services umbrella
    </h2>
    <p>
      There is a mass of different pieces being bolted onto the
      foundations of Web Services provided by WSDL and SOAP 1.2 and
      the diagram simplifies things considerably. The management layer
      is a supervisory layer allowing the control of the many agents
      involved in a web services-based operation. The "Application
      semantics" layer indicates the necessity, for any useful
      interoperability, to have
    </p>
    <p>
      <img alt="Stack based on XML, and HTTP has WSDL and SOAP 1.2 as WS foundation." src="diagrams/ws-stack.png">
    </p>
    <h3>
      Run Time messaging
    </h3>
    <p>
      The design work of web services is divided between the run
      time protocols and the descriptions of services.
    </p>
    <p>
      The W3C work at runtime is based on HTTP transport of
      XML-encoded messages, using the SOAP protocol. (Here by SOAP
      we mean SOAP 1.2; previous versions include early
      proprietary submissions which are not standards or guaranteed
      to interoperate.) There is a bifurcation in the design at
      this point, as SOAP operates basically in two modes.
    </p>
    <p>
      In one, the XML message is used to encode the parameters to a
      remote operation in much the same way as remote method
      invocation in, for example, CORBA, DCOM, or RMI. In this mode,
      XML is used as the marshalling style, but the system is
      distributed using remote procedure calls in a fairly
      traditional way.
    </p>
    <ul>
      <li>There is a standard marshalling syntax
        <p>
          Interfaces between software modules have well-defined
          functions, which in turn have well-defined and typed
          input and output parameters
        </p>
        <p>
          Stubs (dummy routines which simulate the remote procedure
          by a local one which communicates with the remote one) can
          be generated directly from the WSDL definition
        </p>
      </li>
      <li>The remoteness can be transparent, making the design of a
      distributed system similar to the design of a program.
      </li>
    </ul>
    <p>
      In the other mode, SOAP carries an XML document, and the task
      of the receiver is seen more as a document processing
      operation. This is less rigid than the RPC style.
    </p>
    <ul>
      <li>The interface a service provides is defined just by the
      XML schema. This defines the acceptable document types, which
      can allow extension in many ways, using XML namespaces.
        <p>
          The communication is more apparent to the application
          writer, who deals with the document object model (DOM) of
          the received message, rather than having parameters
          unmarshalled automatically.
        </p>
        <p>
          XML tools such as XSLT and XML-Query, and XML encryption
          and so on can be used.
        </p>
        <p>
          It is simpler to use message exchange patterns other than
          the request/response.
        </p>
      </li>
    </ul>
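    <p>
      A rough sketch of the document style, using Python's standard
      <code>xml.etree.ElementTree</code>: the receiver walks the
      element tree itself and skips elements from namespaces it does
      not know. The namespace URIs and element names are invented for
      this example, and the SOAP envelope is left out for brevity.
    </p>
```python
import xml.etree.ElementTree as ET

# Both namespace URIs are made up for this illustration.
ORDER_NS = "http://example.org/ns/order"
EXTRA_NS = "http://example.org/ns/gift-wrap"


def build_order():
    """Build a small order document, including one extension element."""
    order = ET.Element(f"{{{ORDER_NS}}}order")
    ET.SubElement(order, f"{{{ORDER_NS}}}item").text = "lawnmower"
    ET.SubElement(order, f"{{{ORDER_NS}}}quantity").text = "2"
    # Added by a newer sender; older receivers have never heard of it.
    ET.SubElement(order, f"{{{EXTRA_NS}}}wrap").text = "yes"
    return order


def read_order(order):
    """Walk the document tree, keeping only elements in the known namespace."""
    known, skipped = {}, []
    for child in order:
        # ElementTree writes qualified names as "{namespace}local".
        namespace, _, local = child.tag[1:].partition("}")
        if namespace == ORDER_NS:
            known[local] = child.text
        else:
            skipped.append(local)
    return known, skipped
```
    <p>
      An old receiver still reads the item and quantity, and can see
      exactly which part of the message it chose to ignore.
    </p>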
    <p>
      The document mode of SOAP seems to be getting the most
      traction in the ecommerce stack. This is not an accident. The
      XML mode is more flexible than the RPC mode. It is easier in
      principle to extend an XML-based message system to include
      more information as a system grows. In fact, RDF is
      especially powerful in this area, as new information can be
      parsed into an entity-relationship form by old agents, and it
      becomes logically clear which parts can be ignored by those
      who do not understand them.
    </p>
    <p>
      Functionality which has been mentioned as required above the
      basic layer at runtime includes:
    </p>
    <ul>
      <li>Routing. Routing data within a message for processing by
      different agents; defining the workflow path of a message. Black
      box or white box patterns of design.
      </li>
      <li>Security. Profiling existing security technologies for
      use in ebusiness applications using web services.
      Authentication and key management.
      </li>
      <li>Packaging of attachments to messages. XML Packaging.
      </li>
      <li>Reliable messaging (delivery, non-duplication, ordering)
      for the case in which the transport layer (such as TCP under
      the HTTP) does not provide this. (TCP does provide this
      reliability but (a) systems are not designed to keep TCP
      connections open for the weeks or years over which a web
      service may run, and (b) TCP does not provide accountability
      so you can show the tax man the acknowledgement of receipt 7
      years later.)
      </li>
    </ul>
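    <p>
      The reliable messaging item can be illustrated with a receiver
      that numbers messages, drops duplicates, releases messages in
      order, and keeps every acknowledgement it issues. This is a toy
      sketch and not any standardized protocol: the class
      <code>ReliableReceiver</code> and its sequence-number scheme are
      assumptions, and a real system would store the receipts durably.
    </p>
```python
class ReliableReceiver:
    """Hypothetical receiver enforcing non-duplication and ordering."""

    def __init__(self):
        self.next_seq = 1      # next sequence number owed to the application
        self.pending = {}      # messages that arrived early, keyed by number
        self.delivered = []    # payloads handed over, in order
        self.receipts = []     # acknowledgement record, kept for later audit

    def receive(self, seq, payload):
        """Accept one message and return the acknowledgement for it."""
        duplicate = self.next_seq > seq or seq in self.pending
        if not duplicate:
            self.pending[seq] = payload
            # Release every message that is now in sequence.
            while self.next_seq in self.pending:
                self.delivered.append(self.pending.pop(self.next_seq))
                self.next_seq += 1
        receipt = {"seq": seq, "duplicate": duplicate}
        self.receipts.append(receipt)
        return receipt
```
    <p>
      The list of receipts is the part TCP does not give you: it
      outlives any one connection and can be shown to a third party
      later.
    </p>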
    <h3>
      <a name="Descriptio" id="Descriptio">Description</a>
    </h3>
    <p>
      The descriptions of services are made using various different
      models and at different levels of abstraction in different specs
      proposed as part of the stack, though there is agreement on
      WSDL as the modelling of the lowest level, the message or
      request/response interaction, and the binding to the specific
      HTTP (say) port at which it happens.
    </p>
    <p>
      <img alt="Lots of concepts interconnected" src="../2003/Talks/0521-hh-wsa/wsa_concepts.png">
    </p>
    <p>
      Higher layers in the description above WSDL are known
      variously as coordination, orchestration, choreography,
      composition.
    </p>
    <p>
      They involve (compared with basic WSDL), for example:
    </p>
    <ul>
      <li>Protocols involving more than two messages
      </li>
      <li>Protocols having a common shared state over a long period
      </li>
      <li>Protocols having more than two parties involved; web
      service workflow
        <ul>
          <li>structured version (ws-*, damls process model)
          </li>
          <li>precondition-postcondition style (DAMLS)
          </li>
        </ul>
      </li>
      <li>The protocols as business protocols, in terms of common
      business functions
      </li>
      <li>The relationship between allowed transitions in the
      protocol and the content of messages. For example, the
      requirement for a transaction ID to match across a
      transaction, or for possible responses to be a function of a
      request code.
      </li>
    </ul>
    <h3 id="Choreograp">
      Composability and Choreography
    </h3>
    <p>
      Composability of web services refers to the building, from a
      set of web services, of something at a higher level,
      typically itself exposed as a larger web service.
    </p>
    <p>
      Choreography refers more abstractly to the part of the
      description of web services which defines a way, or the ways,
      in which actual invocations of various web services work
      together. (Peltz uses <em>Choreography</em> when it involves
      multiple parties, and <em>Orchestration</em> when it is
      internal to one party. Thus the latter crosses application
      boundaries, while the former also crosses organization boundaries.
      Here we use Choreography in general for both.)
    </p>
    <p>
      There is a small amount of overlap here, which has led to some
      confusion. To be general, one might say, for example, that a
      flight confirmation must involve an already reserved flight.
      This is the actual constraint. One can describe a particular
      choreography (a particular dance, if you like) in which a
      flight query service is called and produces a list of
      flights, then a reservation service is called to reserve
      the flight, that succeeds, and the resulting reservation
      is passed to the confirmation service. It may be that there
      are other ways -- other choreographies -- in which one could
      have achieved a reserved flight. The engineer has the choice
      of modelling the many possible ways all in one choreography,
      or of making several choreographies.
    </p>
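    <p>
      The difference between the constraint and one choreography that
      satisfies it can be sketched in a few lines. Everything here is
      a stand-in: the class <code>TravelServices</code>, its three
      methods and the flight labels are invented, and real services
      would be remote.
    </p>
```python
class TravelServices:
    """Toy stand-ins for the query, reservation and confirmation services."""

    def __init__(self, flights):
        self.flights = list(flights)
        self.reserved = set()
        self.confirmed = set()

    def query(self, destination):
        return [f for f in self.flights if f.endswith(destination)]

    def reserve(self, flight):
        if flight not in self.flights:
            raise LookupError(f"unknown flight: {flight}")
        self.reserved.add(flight)
        return flight

    def confirm(self, flight):
        # The actual constraint: only a reserved flight may be confirmed.
        if flight not in self.reserved:
            raise ValueError(f"flight not reserved: {flight}")
        self.confirmed.add(flight)
        return flight


def query_reserve_confirm(services, destination):
    """One particular choreography: query, then reserve, then confirm."""
    candidates = services.query(destination)
    if not candidates:
        return None
    reservation = services.reserve(candidates[0])
    return services.confirm(reservation)
```
    <p>
      The check inside <code>confirm</code> holds whichever dance led
      to the reservation, so a caller that reserves a known flight
      directly and then confirms it is following a different, equally
      valid choreography.
    </p>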
    <p>
      Web services <em>can</em> be combined in such a way that
      messages are passed around in a very random fashion. However,
      a particular design technique is for a master process to
      delegate to other services in a recursive tree-like manner, as
      has been de rigueur in programming languages since Pascal.
      For example, if the consumer asks the travel agent and
      the travel agent books a hotel, the hotel will reply to the
      travel agent, not to the consumer. This makes everything
      orderly.
    </p>
    <p>
      WSCI, BPML and BPEL take this approach to choreography. This
      is a programming language approach with
    </p>
    <ul>
      <li>Sequential, Parallel and Exception execution, loops &amp;
      conditionals
      </li>
      <li>Message-passing rendezvous between processes
      </li>
      <li>Calls: Web Services
      </li>
      <li>Data: bits of XML
      </li>
      <li>Assignment to variables
      </li>
      <li>Expressions: XPath 1.0 pluggable in BPEL
      </li>
      <li>Does not handle actual calculations, rules etc.
      </li>
    </ul>
    <p>
      WSCI has the emphasis on description, and BPEL on being able
      to compile to an executable agent. As neither is intended to
      do the actual calculations or business rules, it would be
      closer to compare them with scripting shells such as bash
      which handle concurrency and synchronization but actually
      call programs (or rather web services) to do the real work.
    </p>
    <p>
      See:
    </p>
    <ul>
      <li>WS Choreography Group
      </li>
      <li>IBM, Microsoft and BEA, under OASIS, <em>BPEL4WS</em> (not
      W3C, not RF).
      </li>
      <li>BPMI, Business Process Modelling Language BPML
      </li>
      <li>Sun et al: <em>Web Services Choreography Interface
      (WSCI),</em> W3C Note
      </li>
      <li>IBM specs ws-coordination, ws-transaction,
      ws-orchestration
        <p>
          Chris Peltz, Hewlett Packard, <em><a href="http://devresource.hp.com/drc/technical_white_papers/WSOrch/WSOrchestration.pdf">
          Web Services Orchestration</a> - a review of emerging
          technologies, tools, and standards.</em>
        </p>
      </li>
    </ul>
    <p>
      @@ Different attitudes - top down program design, or
      bottom-up agent design, bottom up document design.
    </p>
    <h3>
      Message-oriented choreography
    </h3>
    <p>
      The <a href="PaperTrail.html">Paper Trail</a> concept is that
      the state of a multi-agent multi-process system can be looked
      at, sometimes rather effectively, as a function of the
      documents which have been transmitted.
    </p>
    <p>
      The process-oriented attitude to a bank-customer relationship
      may be "In parallel, the customer writes checks, merchants
      pay in checks, credit card transactions happen, all month.
      Then, the charges, interest are assessed and a bank statement
      sent from the bank to the customer". The document-, or
      message-oriented one is more like "Every month a bank balance
      lists valid transactions dated that month. A cleared incoming
      check is a valid transaction. A cleared outgoing check is a
      valid transaction. A validated credit card debit is a valid
      transaction. A check is cleared if it is incoming and there
      is a matching transfer from the payee bank", and so on. This
      builds the relationships up in a bottom-up, weblike way. The
      process-oriented attitude suggests the bank be written as a
      procedure in a top-down way using for example WSCI and BPEL.
      The document-oriented attitude suggests the use of business
      rules systems triggered by the receipt of new information --
      new documents, in this case new web services messages.
    </p>
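    <p>
      The document-oriented reading can be sketched as a handful of
      rules over the set of documents received so far. The dictionary
      fields (<code>type</code>, <code>direction</code>,
      <code>check_id</code>, <code>validated</code>,
      <code>date</code>) are assumptions made for the example, and the
      clearing rule for outgoing checks, which the text does not
      spell out, is left unsketched.
    </p>
```python
def is_cleared(check, documents):
    """Rule: an incoming check is cleared if a matching transfer exists."""
    if check["direction"] != "incoming":
        return False  # the rule for outgoing checks is not sketched here
    return any(
        d["type"] == "transfer" and d["check_id"] == check["id"]
        for d in documents
    )


def is_valid_transaction(doc, documents):
    """Rules: cleared checks and validated card debits are valid."""
    if doc["type"] == "check":
        return is_cleared(doc, documents)
    if doc["type"] == "card_debit":
        return doc.get("validated", False)
    return False


def statement(documents, month):
    """Rule: a statement lists the valid transactions dated that month."""
    return [
        d["id"] for d in documents
        if d.get("date", "").startswith(month)
        and is_valid_transaction(d, documents)
    ]
```
    <p>
      Nothing here says in what order things happen. Receiving one
      more transfer document simply makes one more check appear on
      the next evaluation of the statement.
    </p>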
    <p>
      (Web service messages are of course documents just like
      documents sent in email. Messages are particular in that they
      have a particular time of transmission, and their document
      contents do not change. They do of course generally have
      identifiers, and even though they can only be accessed by
      sender and explicit receivers, they can still be regarded as
      part of the web by those parties.)
    </p>
    <p>
      Whether the design process is a top-down process-oriented one
      or a bottom-up document-oriented one, the design will have to
      be translated into a set of agents and their responses to
      incoming messages. This manipulation can of course be done
      automatically.
    </p>
    <p>
      A concern in all this frantic design is its evolution with
      time. A BPEL script sets out to be a description of a
      business process at a high level. The critical values which
      decide on conditional execution, or which correlate a
      particular process with a given transaction, are expressed as
      parts of the structure of the XML messages. This may lead to
      what has been called "DTD fragility". What happens when you
      change the DTD? The design of the message types with XML
      schema is the sort of thing which is difficult to get
      everyone in a company to agree on, and tends to change with
      time. There are many arbitrary choices made as to how the
      knowledge in the message is serialized as XML. Moving to RDF
      may, by removing a layer of arbitrary design, reduce that
      fragility and allow web service choreography to evolve with
      time within and outside a company.
    </p>
    <h3>
      <a name="Process" id="Process">Process modeling</a>
    </h3>
    <p>
      When considering a business system with multiple agents and
      multiple concurrent processes, one would like to have an
      automated way of checking some fundamental questions.
    </p>
    <ul>
      <li>Will the process necessarily terminate?
      </li>
      <li>Will the service respond within a given time?
      </li>
      <li>Will the net gain from a sale always be positive?
      </li>
      <li>Will we ever promise to ship something we do not have in
      stock?
      </li>
    </ul>
    <p>
      and so on.
    </p>
    <p>
      The pi calculus and other calculi derived from it are formal
      ways of modeling systems with multiple agents and multiple
      processes. They can go some way to answering these questions.
      Rule-based systems can also be designed so that proofs can be
      found of this sort of conjecture. This is a good reason for
      keeping the languages involved as simple as possible. It may
      be the design reason for the limitations on computational
      power in WSCI and BPEL.
    </p>
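    <p>
      For a model small enough to enumerate, a question such as "will
      we ever promise to ship something we do not have in stock?" can
      be checked by visiting every reachable state. The sketch below
      does only that; it is not the pi calculus or any of the tools
      mentioned, and the toy shop, its two actions and the cap of
      three promises in the careless variant are all invented.
    </p>
```python
from collections import deque


def reachable_states(initial, step):
    """Breadth-first enumeration of every state reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in step(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def make_shop(check_stock):
    """Return the transition function of a toy shop model.

    A state is (in_stock, promised). The shop may accept an order or ship
    a promised item; `check_stock` says whether it looks before promising.
    """
    def step(state):
        in_stock, promised = state
        successors = []
        may_accept = in_stock > promised if check_stock else 3 > promised
        if may_accept:
            successors.append((in_stock, promised + 1))
        if promised > 0 and in_stock > 0:
            successors.append((in_stock - 1, promised - 1))
        return successors
    return step


def never_overpromises(initial, step):
    """Check the invariant 'promised never exceeds stock' in every state."""
    return all(stock >= promised
               for stock, promised in reachable_states(initial, step))
```
    <p>
      Exhaustive search of this kind only works while the state space
      stays small, which is one more argument for keeping the process
      languages simple.
    </p>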
    <p>
      See Petri nets (IBM; and Stanford), pi calculus @@ refs.
    </p>
    <p>
      Much of the functionality is seen in terms of tying web
      services down to well-known functionality such as existing
      transaction processing systems, PKI trust infrastructure, and
      so on.
    </p>
    <h3>
      <a name="Discovery" id="Discovery">Discovery</a>
    </h3>
    <p>
      In a large number of applications, web services will be
      provided on one hand, and used on the other, by peers who
      have established relationships. Indeed, until a trust
      infrastructure is fairly developed it is not reasonable to
      expect computers to do automatic comparison shopping for very
      many services. Web services will probably (like the web in
      1993, and the Semantic Web in 2003) spread first within the
      corporate firewall, where security problems are minor and
      mistakes less embarrassing than inter-enterprise or
      publicly. However, the goal is that so many web services
      should be available that it will be important to be able to
      find them in all kinds of ways.
    </p>
    <p>
      The UDDI project and the related work on description and
      query systems is aimed at this. A positive aspect of UDDI is
      the definition of an ontology for web services. Problems with
      it are that it is centralized by design, both in the
      single-tree ontology, and in the design based fundamentally
      on a central registry, with inter-registry operation as a
      secondary thing.
    </p>
    <p>
      From the semantic web point of view, web services are simply
      one aspect of the many things which will be searched for.
      Indeed, the fact that a web service is provided may in fact
      be rather incidental to the essential nature of the business
      item which is discovered -- a trader in stocks, a seller of
      lawnmowers, and so on. The semantic web aims to describe any
      aspect of anything, including the catalogs, parts, materials,
      services organization, relationships and contracts. A query
      system which addresses web services only makes sense when
      smoothly integrated with the rest of the web of enterprise
      knowledge.
    </p>
    <h3>
      <a name="Services" id="Services">Web Services and Semantic
      Web</a>
    </h3>
    <p>
      The question of the relationship between these two activities
      is constantly in the air.
    </p>
    <ul>
      <li>The whole description side is a clear semantic web
      application, and so long as XML languages are defined which
      are introduced with English-language specs but no RDF mapping,
      there is a potential ambiguity which will have to be resolved
      later in making that mapping, there is an inability to use
      common semantic web tools, and there is a cost down the road,
      assuming semantic web tools will eventually be used.
      Essentially, web services become instant legacy technology
      for the semantic web.
      </li>
      <li>The DAML-services coalition of researchers is tackling
      the job of service description at a higher level.
      </li>
      <li>Many things which are described as web services can in
      fact be described as the publication of a series of semantic
      web documents, just as the billing of a peer company is in
      reality effected by the issuance of an invoice.
      </li>
      <li>When Semantic Web agents query each other, they could use
      SOAP (though a direct encoding into an HTTP URI may also be
      effective).
      </li>
      <li>When Semantic Web agents update each other, they should
      use SOAP, running typically over HTTP POST.
      </li>
    </ul>
    <p>
      The argument against integration of the technologies is
      mainly social. It is costly to coordinate very large groups.
      It is much more efficient to develop WS and SW independently.
      Neither side has a great incentive to take on the learning
      required to absorb the needs and potentials of the other.
      Using technology in preparation by another group takes a
      great leap of faith, and does really add to the development
      time. These are real issues. So while it may take more effort
      in the long run, it is a better parallelization of the design
      task to allow web services and semantic web to proceed in
      together without a mandated link. (This was the apparent
      consesnsus of the W3C AC meeting in Nice, 2000/11)
    </p>
    <p>
      That said, wherever overlap of expertise between the
      technologies occurs, those who form a bridge should do their
      best to make the conceptual differences as small as possible.
      There is a Semantic Web Services group, connected to the
      ______
    </p>
    <h3>
      Service design tools
    </h3>
    <p>
      Most modern software design differentiates strongly between
      the design of an interface, and the design of the software
      which implements it.
    </p>
    <p>
      Web services are required to be composable - you should be
      able to make a web service implementation by building it out
      of component web services. At the low level, think of making
      a latitude/longitude to state code converter composed from a
      latitude/longitude to postal code converter and a postal
      code to state code converter. At a high level, think of
      a vacation being composed of reservations of flights,
      hotels and entertainment.
    </p>
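    <p>
      The low-level case can be sketched in a few lines. This is only
      an illustration: the two lookup tables are hypothetical stand-ins
      for remote services, and the function names are invented here.
    </p>

```python
# Hypothetical stand-ins for two remote services. A real system would
# call them over the network; the data below is illustrative only.
POSTAL_BY_LATLONG = {(42.36, -71.06): "02108"}
STATE_BY_POSTAL = {"02108": "MA"}


def latlong_to_postal(latlong):
    """Component service 1: latitude/longitude to postal code."""
    return POSTAL_BY_LATLONG[latlong]


def postal_to_state(postal):
    """Component service 2: postal code to state code."""
    return STATE_BY_POSTAL[postal]


def compose(first, second):
    """Build a new service by feeding the output of one into the other."""
    def composed(value):
        return second(first(value))
    return composed


# The composed service has the same shape as its components:
# one input, one output, no knowledge of how it is implemented.
latlong_to_state = compose(latlong_to_postal, postal_to_state)
state = latlong_to_state((42.36, -71.06))
```

    <p>
      The caller of the composed converter sees only its interface,
      which is the separation of interface and implementation
      mentioned above.
    </p>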
    <h3>
      Runtime System management
    </h3>
    <p>
      Real web services have multiple agents running in commercial
      environments, in which downtime is expensive, and incorrect
      operation could be disastrous. The running, monitoring,
      provisioning and upgrading of such systems clearly requires
      tools, but their design is out of scope for this overview.
    </p>
    ]]></description>
  </item>
    
    <item>
      <pubDate>Wed, 01 Mar 2017 00:00:00 GMT</pubDate>
  <title>Webizing an existing application</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Webize.html</link>
    <guid>https://www.w3.org/DesignIssues/Webize.html</guid>
      <description><![CDATA[

<h1><a name="Webizing" id="Webizing">Webizing existing systems</a> </h1>

<p><em>This discusses the introduction of URIs as names in a system to scale
it to the web.</em> </p>

<p>The web is extended in two ways - by adding new bits of technology to the
existing stuff, and by "webizing" existing applications and systems. Webizing
is really important, not only as a way of bootstrapping the web using the large
amount of legacy information, but because the existing systems have been
researched and designed over the years and it is really important we do not
lose the knowledge accrued during that process. </p>

<p>The essential process in webizing is to take a system which is designed as
a closed world, and then ask what happens when it is considered as part of an
open world. Practically, the effect on a computer language is to replace the
names/tokens/identifiers with URIs. Thus, where before reference could only be
made to something in the same document/program/module one can with equal ease
make reference to something in a different one somewhere in that abstract
space which is the Web. </p>

<p>In a clean case, this will be done so that the URI for an object is rather
naturally related to its representation in the original language. For
example, the element with ID "foo" in bar.xml is bar.xml#foo. However, to do
the same for an attribute defined in a DTD or schema is more difficult,
because of the complex nature of the spaces and subspaces for element and
attribute names in XML. It is great when the webized language is very similar
to the original language, and ideal when it actually compiles. Dan Connolly's
2000/8 <a href="#Connolly">webization of KIF</a> uses URIs for identifiers,
but because URIs are case sensitive and KIF tokens are not, an accurate
translation had to escape lower case letters with backslashes,
which made the result less readable. Changing the underlying language in
small ways can make the translation much less cumbersome. </p>
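
<p>As a minimal sketch of the clean case, assuming a hypothetical document
URI of http://example.org/docs/bar.xml, a local ID and a reference into
another document can both be turned into absolute URIs with the same
resolution step. The function names are invented for this illustration.</p>

```python
from urllib.parse import urljoin

# Hypothetical URI of the document being webized (an assumption).
DOCUMENT_URI = "http://example.org/docs/bar.xml"


def webize_id(local_id, base=DOCUMENT_URI):
    """Turn an ID that only had meaning inside one document into a URI."""
    return urljoin(base, "#" + local_id)


def resolve(reference, base=DOCUMENT_URI):
    """Resolve a relative reference, which may point into another document."""
    return urljoin(base, reference)


local = webize_id("foo")            # something in the same document
remote = resolve("other.xml#baz")   # something in a different one
```

<p>Once names are URIs, the reference to the other document costs no more
than the reference to the local one.</p>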

<p>Here is a slightly flippant view on the webize() function, each row of
which probably needs an essay of explanation, but provided here without
any.</p>

<table border="4">
  <caption></caption>
  <tbody>
    <tr>
      <td>x</td>
      <td>webize(x)</td>
    </tr>
    <tr>
      <td>Hypertext</td>
      <td>WWW</td>
    </tr>
    <tr>
      <td>Data</td>
      <td><a href="LinkedData.html">Linked data</a></td>
    </tr>
    <tr>
      <td>Top-down structured design</td>
      <td>Bottom-up ontology design</td>
    </tr>
    <tr>
      <td>Data Hiding</td>
      <td>Data Re-use</td>
    </tr>
    <tr>
      <td>Goto Considered Harmful</td>
      <td>Goto drives the economy</td>
    </tr>
    <tr>
      <td>unix file system</td>
      <td><a href="CloudStorage.html">ACL'd r/w linked data</a></td>
    </tr>
    <tr>
      <td>Large-scale structure: Hierarchy</td>
      <td><a href="Fractal.html">Large-scale structure: Scale free</a></td>
    </tr>
    <tr>
      <td>"Tired"</td>
      <td>"Wired"</td>
    </tr>
  </tbody>
</table>

<h3><a name="Example" id="Example">Example - webizing a database</a> </h3>

<p>Imagine that a database is to be made available on the web in RDF. Suppose
the database itself will have a URI of http://weather.org/current. An SQL
database is essentially a closed world, in that the various things in it were
not designed to be linked to from outside. An SQL statement </p>
<pre>SELECT temp, zip  FROM weather WHERE temp  &gt; 30</pre>

<p>makes reference to terms which have meaning within the database. There is
no reference in that statement to the database - that is simply part of the
context. </p>

<p>Now suppose we determine what the URI will be for the pieces of the
database, perhaps current/weather for a table, and current/weather.temp for a
column in a table. We could then extend the syntax (excuse my SQL - I am
making this up) </p>
<pre><span style="color: #FF0000">USING c FOR http://weather.org/current</span>
<span style="color: #FF0000">USING u FOR http://places.org/usa</span>

SELECT <span style="color: #FF0000">c:</span>readings.temp, <span style="color: #FF0000">u:</span>location.lat, <span style="color: #FF0000">u:</span>location.long
  FROM JOIN <span style="color: #FF0000">c:</span>readings, <span style="color: #FF0000">u:</span>location
  WHERE <span style="color: #FF0000">c:</span>readings.zip = <span style="color: #FF0000">u:</span>location.zip
  AND <span style="color: #FF0000">c:</span>readings.temp &gt; 30;</pre>

<p>This is (incorrect, I expect @@@) SQL which links out of the local
database to combine it with information from a remote one. This syntax, I am
sure, won't work in practice, but it should illustrate the principle. Namespaces
c and u are introduced for two reasons: for brevity, as repeating the full URIs
in the code would have been too cumbersome; and for syntactic reasons, as URIs
tend to contain characters which are not allowed in SQL column names or which
would be ambiguous with other syntax. </p>
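
<p>The expansion of such prefixed names is mechanical. Here is a rough
sketch, using the two bindings from the made-up SQL above. The
<code>expand</code> function, and the choice of a slash to join namespace
and local name, are assumptions made for this illustration.</p>

```python
# Prefix bindings taken from the USING lines of the made-up SQL.
PREFIXES = {
    "c": "http://weather.org/current",
    "u": "http://places.org/usa",
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as c:readings.temp into a full URI."""
    prefix, colon, local = name.partition(":")
    if not colon or prefix not in prefixes:
        # A bare name has no meaning outside its own database.
        raise ValueError("no namespace bound for " + repr(name))
    # Assumption: the local part is appended after a slash.
    return prefixes[prefix] + "/" + local


temperature = expand("c:readings.temp")
latitude = expand("u:location.lat")
```

<p>A name with no prefix is rejected rather than guessed at, since in an
open world there is no implicit database to resolve it against.</p>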

<p>Of course, whether actually running SQL over a set of scattered databases is
valuable may be questionable - it may not optimize as well as some other
query languages. However, suddenly the things defined by the database are
available to the outside world. For example, the concept of temperature
reading as used by weather.org in its database of current conditions </p>

<p><code>http://weather.org/current/readings.temp</code> </p>

<p>is now a concept, an RDF property in fact, which is available for all the
world to refer to. These references need not all be in SQL. Because the
schema for the database will declare it to be an RDF property or something
equivalent, many different systems can use the information and refer to the
concept. </p>
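
<p>To make this concrete, here is one possible way a row could be exported
as statements which use the column URIs as properties. It is a sketch
only: the scheme for naming the row itself (table URI plus a key value as
fragment) is an assumption of this illustration, not something the
database or RDF prescribes.</p>

```python
BASE = "http://weather.org/current/"


def row_to_triples(table, key_column, row):
    """Export one row as (subject, property, value) triples.

    The property URIs follow the table.column pattern used in the text.
    The subject URI scheme is an assumption made for this sketch.
    """
    subject = BASE + table + "#" + str(row[key_column])
    return [
        (subject, BASE + table + "." + column, value)
        for column, value in row.items()
    ]


# Illustrative data only.
triples = row_to_triples("readings", "zip", {"zip": "02139", "temp": 31})
```

<p>Any system, whether or not it speaks SQL, can now refer to the
<code>readings.temp</code> property by its URI.</p>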

<h4 id="Notes">Notes specifically on this example </h4>

<p>I note, before we leave this example, that there are two concepts
important to a table. One is the type of thing described by a row. A row in
the readings table, for example, defines a weather reading, something which
has a location, a temperature and a humidity. The other concept is
the set of objects which are actually in the table. In the classic SQL
example of the employees table, there is an RDF class employee, a subclass of
person, and also the fact that someone works for the company iff they are in
the table. </p>

<p>A second note on exporting databases. When you really put something on the
web, there is often, for flexibility and security, a layer between what you
expose and the internal storage. Web pages are not files, though they are often
closely related to files and have the same form - a string of bytes and a
MIME type. Exposed remote operations are not local procedures, though they are
closely related to them and have the same form -- a service URI and a method name
and parameters. Similarly one would probably export a derived view of a
database in many cases - one which would have the form of a database. This
allows different engineering decisions to be made on the external
manifestation (persistent and what the customer wants) and the internal form
(efficient and convenient for you). </p>

<h2 id="webize">Webizing nested languages </h2>

<p>Sometimes this is easy and sometimes it is hard. It is hard, for example,
when the language uses nested scoping to great effect. In this case there is
a very large amount of context which is completely different between the
beginning and end of such a link. The <em>go to</em> instruction is
considered harmful [<a href="#GTCH">ref</a>] by Dijkstra because it "<em>as
it stands is just too primitive; it is too much an invitation to make a mess
of one's program</em>." This of course is true of the hypertext link too, in
a way. Both allow an open webbed world which typically, if used with no
restraint, remove rules which give sanity and analysability to a language and
allow optimization of the code compiled. So, just as some languages prevent
one from jumping into or out of an inner loop of a program, so it may make no
sense to allow a link to be made into something within a nested structure,
because the referenced thing just does not have any meaning when taken out of
context. </p>

<p>When dealing with languages which have nested context, it may be necessary
either to define how something inside is represented independently of context,
or to make such a reference impossible. </p>

<p>Be careful, though, before jumping to this conclusion. In many cases, it
is important to webize nested objects completely. For example, in a 3d scene
language, an object may be within a scene within an object within a scene and
still have identity which is important to be able to refer to. In a hypertext
document, there is a nested context which for example affects the style, and
the reference is made to the destination anchor not as an isolated piece of
hypertext, but in the context of the whole document. </p>

<p>The principle that on the Web, anything must be able to say anything about
anything means that these innermost nested objects must have URIs. </p>

<p>It may also be the case that an attempt to webize a language reveals bad
points in the design which really need to be ironed out anyway for the cause
of good software engineering. If a name in some module has in fact quite
different meanings when used in different contexts, then it isn't suitable
for webizing as it is, and maybe two separate derived URIs should be made in
the mapping. Maybe the language should actually be cleaned up so that the
concepts are distinct. </p>

<p>A very simple case is in a documentation control system, when humans use
the same document name ("the pipe size draft") to refer to a particular
document and also to the set of documents from </p>

<p>An exercise for the reader is to contemplate a language and determine
whether it is webized, and if not, what it would take, and what would be the
cleanest way of doing it. Try looking at XML schemas (what is the URI of an
element type?). </p>

<p>When stuck, have recourse to common sense. Ask what the construct actually
represents in a global context, if anything. This might mean clarifying the
language itself. </p>

<h2><a name="Conclusion" id="Conclusion">Conclusion</a> </h2>

<p>Webizing a language involves turning from a system which assumes a closed
world to one which will operate as part of the open web. Some cases are
easier than others. Webizing one application gets one a good idea of what
sorts of design decisions force a closed world assumption and make webizing
difficult, and what by contrast makes a weblike application which immediately
benefits from the rest of everything out there. </p>
]]></description>
  </item>
    
    <item>
      <pubDate>Sun, 19 Jan 2025 00:00:00 GMT</pubDate>
  <title>Charlie Works</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/Works.html</link>
    <guid>https://www.w3.org/DesignIssues/Works.html</guid>
      <description><![CDATA[
    <p>We describe the development of “Charlie,” an AI assistant
    working for the user by leveraging personal data
    securely stored in a Solid Pod. Building on a 2017 <a href="Charlie.html">
    proposal</a> for
    a trustworthy and user-centric AI, the work highlights progress
    made by engineers at Inrupt in late 2024. Using a simulated
    dataset for a fictional user, “Bob,” the team integrated an
    advanced large language model (LLM) with Bob’s personal data to
    demonstrate the transformative potential of combining AI and
    Solid Pods.</p>
    <p>By accessing rich, structured, and personalized data,
    Charlie provided responses far superior to generic AI systems,
    as shown in a use case involving running shoe recommendations
    tailored to Bob’s fitness and lifestyle data. This
    personalization exemplifies the next level of AI capability,
    offering unprecedented usefulness while maintaining user
    trust.</p>
    <p>We underscore the critical role of robust data
    infrastructure, including Solid Pods and the semantic web, in
    driving AI systems, and the dual role of AI in both populating
    and leveraging linked data stores, paving the way for a dynamic
    ecosystem where data graphs mediate AI interactions. We paint
    an exciting vision for integrating personal data, semantic web
    principles, and advanced AI to create tools that truly serve
    users.</p>
  
<p><a href="https://www.w3.org/DesignIssues/Works.html">Read whole article...</a></p>
]]></description>
  </item>
    
    <item>
      <pubDate>Tue, 01 Jan 2002 00:00:00 GMT</pubDate>
  <title>The Interpretation of an XML document</title>
    <author>timbl@w3.org (Tim Berners-Lee)</author>
    <link>https://www.w3.org/DesignIssues/XML.html</link>
    <guid>https://www.w3.org/DesignIssues/XML.html</guid>
      <description><![CDATA[
    <h1>
      The Interpretation of XML documents
    </h1>
    <h3>
      Abstract:
    </h3>
    <p>
      It might seem that the specifications of different XML
      namespaces can make inconsistent claims such that the
      semantics of mixed-namespace documents are inconsistent.
      The solution sometimes proposed is a "processing model
      language" such that there is no default meaning of an XML
      document without such an external processing definition. This
      article argues that there is only one basic generic
      processing model (or rather, algorithm for allocating
      semantics) for XML documents which preserves needed
      properties of a multi-namespace system. These properties
      include the need to be able to define the semantics of an XML
      element without clashes of different specifications. This
      introduces the concept that the interpretation of an XML
      document is defined starting at the document root by the
      specifications of the element types involved. A common class
      of foreign element
      name, called here <em>XML function</em>, has to be recognized
      in default processing by any supporting application, and
      returns more XML when it is elaborated.
    </p>
    <h2>
      <a name="problem" id="problem">The problem</a>
    </h2>
    <p>
      If one party sends another an XML document, how does one say
      what it means? Or, if you don't like the <em>meaning</em>
      word, what specs are invoked in what way when an XML document
      is published or transmitted? This question is sometimes posed
      as: What is the processing model for XML?
    </p>
    <p>
      The interpretation of a plain XHTML document is fairly well
      understood. The document itself is a human language document,
      and so the conventions - sloppy and wonderful - of human
      language define how it is understood and interpreted. And the
      interpretation of tags such as H1 is described in a
      well-thumbed standard and many books, and is implemented more
      or less consistently in many devices.
    </p>
    <p>
      But what happens when we start to mix namespaces? When SVG is
      embedded within XHTML, or RDF or XSLT for that matter, what
      are the rules which ensure that the receiver will understand
      well the intent, the client software do the right thing --
      and the person understand the right thing? The same issues
      obviously apply when the information has machine-readable
      semantics.
    </p>
    <p>
      As Paul Prescod <a href="http://lists.w3.org/Archives/Public/www-tag/2002Feb/0123.html">
      points out</a>, there are plenty of places one might think of
      looking for information about how to process a document:
    </p>
    <ol>
      <li>DOCTYPE statement
      </li>
      <li>top-level namespace
      </li>
      <li>schema reference declaration
      </li>
      <li>other root-level declared namespaces
      </li>
      <li>any attribute on the root element
      </li>
      <li>anything in the document
      </li>
    </ol>
    <p>
      In fact the general problem is that without any overall
      architecture, one can write specs which battle each other.
      "The X attribute changes the meaning of the Y attribute",
      "The Z attribute restores the meaning of the X attribute
      irrespective of any Y attribute" and so on. In such a world,
      one would never know whether one had correctly interpreted
      anything, as there might be somewhere something deemed to
      change the meaning of what we have. Clearly this way lies
      chaos. A coherent architecture defines which specs to look at
      to determine the interpretation of a document. We don't have
      this yet (2002) for XML.
    </p>
    <p>
      However, in practice if a person were to look at a document
      with a mixture of XHTML and SVG, they would probably find its
      meaning unambiguous.
    </p>
    <p>
      In the same message, Paul opines, <em>Top-down
      self-descriptiveness is one of the major advantages of XML
      and I think that doing otherwise should be deprecated</em>. I
      completely agree with this conclusion. He concludes correctly
      that the root namespace (the namespace of the document
      element) [or a DOCTYPE, which I will not discuss further] is
      the only thing one must be able to dispatch on.
    </p>
    <h3>
      <a name="pipeline" id="pipeline"></a>The Pipeline Processing
      model
    </h3>
    <p>
      However, he secondarily concludes that, because it is
      important to define what processing to be done first, one
      should use wrapper elements, so that if there are any XSLT
      elements within a document, a wrapper causes XSLT processing
      to be done, and so on. The discussion about documents with
      more than one namespace has often made an implicit assumption
      that the XML is to be processed in a pipeline, in which each
      stage implements one XML technology, such as include
      processing, style sheet processing, decryption, and so on.
      The point of this article is that while this
      works in simple cases, in the general case the pipeline model
      is basically broken. Once you have things arbitrarily
      nested inside each other, there is no single pipeline which
      will do a general case. And nesting things inside each
      other in arbitrary ways is core to the power of XML.
    </p>
    <h3>
      <a name="Specific" id="Specific">Specific cases: XML
      functions</a>
    </h3>
    <p>
      The pipeline model makes it very messy to address a situation
      which is increasingly common. This is that of an XML document
      which contains a large number of embedded elements from
      namespaces such as
    </p>
    <ul>
      <li>XSLT, in "<a href="http://www.w3.org/TR/1999/REC-xslt-19991116#result-element-stylesheet">Literal
      Result Element as Stylesheet</a>" mode.
      </li>
      <li>XInclude
      </li>
      <li>XMLEncryption
      </li>
      <li>XQuery (?)
      </li>
      <li>Internationalization tags such as "do not translate this
      phrase when translating this document"
      </li>
    </ul>
    <p>
      These namespaces share common properties:
    </p>
    <ul>
      <li>They are the sort of thing you want to use with any sort
      of document, without it having to be foreseen in the schema
      for the original document
      </li>
      <li>The content of these elements is not the final form, but
      will be replaced with other content
      </li>
      <li>The resulting content may recursively have invocations of
      the same or different things from the list
      </li>
      <li>The effect of processing the element in this namespace is
      constrained such that it can only elaborate the contents of
      that branch of the tree. The element is replaced with its
      result of processing, but none of its ancestors or siblings
      may be affected.
      </li>
      <li>There are certain very special cases in which you want to
      be able to mention one without it being expanded.
      </li>
    </ul>
    <p>
      To treat these as a group, I will call these elements
      <strong>XML functions</strong>. The term is not picked
      randomly. Let's look at some examples, each of which has its
      peculiarities.
    </p>
    <h4>
      <a name="XSLT" id="XSLT"></a>XSLT Literal Result Element as
      Stylesheet (LRES)
    </h4>
    <p>
      Let me clarify this way of looking at XSLT. The XSLT spec
      defines an XSLT namespace and how you make an XSLT document
      (stylesheet) out of it. Normally, the style sheet has
      <span style="font-family: monospace;">xsl:stylesheet</span>
      as its document element. However, there is a special "Literal
      result element as Stylesheet" (LRES) form of XSLT, in which a
      template document in a target namespace (such as XHTML) has
      XSLT embedded in it only at specific places. Here is an
      example from the spec.
    </p>
    <pre>&lt;html xsl:version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns="http://www.w3.org/TR/xhtml1/strict"&gt;
  &lt;head&gt;
    &lt;title&gt;Expense Report Summary&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;p&gt;Total Amount: &lt;xsl:value-of select="expense-report/total"/&gt;&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;
</pre>
    <p>
      The XSLT spec formally defines the LRES form as an
      abbreviation for the full form. In doing so it loses the
      valuable fact that in the LRES form, XSLT elements behave as
      XML functions. They actually adhere to the constraints above.
      This is very valuable. The XSLT spec says that the
      interpretation is that an XSLT document is generated and
      processed to return the "real" document. However, this does
      not scale in design terms. As the XSLT specification itself
      notes,
    </p>
    <p style="margin-left: 40px;">
      "In some situations, the only way that a system can recognize
      that an XML document needs to be processed by an XSLT
      processor as an XSLT stylesheet is by examining the XML
      document itself. Using the simplified syntax makes this
      harder.<br>
      <br>
      NOTE: For example, another XML language (AXL) might also use
      an axl:version on the document element to indicate that an
      XML document was an AXL document that required processing by
      an AXL processor; if a document had both an axl:version
      attribute and an xsl:version attribute, it would be unclear
      whether the document should be processed by an XSLT processor
      or an AXL processor.<br>
      Therefore, the simplified syntax should not be used for XSLT
      stylesheets that may be used in such a situation"
    </p>
    <p>
      It does not work when other namespaces use the same trick. It
      also prevents applications from using optimizations which
      result from the constraints above. So, while the spec
      formally defines a template document in that way, one can
      make, it seems, a completely equivalent definition in terms
      of XML functions.
    </p>
    <p>
      Imagine a document in which at various different parts of the
      tree different forms occur, and in which these xml functions
      are in fact nested: you resolve an XInclude and it returns
      something with XSLT bits in.
    </p>
    <p>
      It is essential primarily to define what such a document
      should actually be when (for example) presented to a user. It
      is an extra plus to have some visibility from outside the
      document as to what functionality will be necessary to have
      to fully process the document, such as from the MIME header,
      but we can get to that later.
    </p>
    <h4>
      <a name="XInclude" id="XInclude"></a>XInclude
    </h4>
    <p>
      This is probably a simple function. The include element is
      replaced by the referenced document or part of document. This
      is straightforward and obviously nests.
    </p>
    <p>
      It is also obvious that, when XIncludes are nested, it
      doesn't make any difference
      whether you consider the inner ones to be expanded before or
      after the outer ones. (The base URI of a reference always has
      to be taken as that of the original source document, no
      matter where the reference ends up being expanded.)
    </p>
    <h2>
      <a name="Processing" id="Processing">Top-down Processing
      model</a>
    </h2>
    <p>
      I think that the battle over the order of processing of XML
      functions is often an ill-formed question. XML is a tree. It
      is appropriate for the interpretation of the tree to be
      defined from the top down. This does not determine the order
      in which the leaves of the tree have to be done.
    </p>
    <p>
      Here are some ways in which processors could handle an XHTML
      document containing XML functions:
    </p>
    <ul>
      <li>Noting that XHTML is a plain vanilla language, but that
      this document contains other things, first pipeline it
      through an XSLT processor, then an XInclude processor (the
      order being arbitrary), then an XML decryption processor,
      and again in a cycle, until there are no functions left.
      </li>
      <li>Invoke an XML support class which then parses the
      document recursively. This more powerful XML parser has the
      ability to dispatch to the support class for an XML function
      whenever it finds one.
      </li>
      <li>Invoke an XHTML support class which then parses the
      document as it needs to in order to display it. This more
      powerful XML parser has the ability to dispatch to the
      support class for an XML function whenever it finds one.
      However, the XHTML parser uses the constraint that in certain
      cases the front of an XHTML document can be displayed before
      the last has been parsed, and it actually delays evaluation
      of functions until the user's use of scroll keys makes it
      necessary. It turns out that certain things never need to be
      evaluated at all, saving time and bandwidth.
      </li>
    </ul>
    <p>
      This is NOT supposed to be a definitive list of ways of
      parsing XML documents with functions - it is only supposed to
      illustrate the fact that many approaches are possible which
      can be shown to be mathematically equivalent in their effect.
      (This is why I tend to talk about the meaning, or
      interpretation, of a document, rather than the processing
      model.)
    </p>
    <h3 id="need">
      <a name="quote" id="quote"></a>The need to quote
    </h3>
    <p>
      That said, it may be necessary to define a reference
      processing model, just because one has to have a way of
      describing what the document means. In this case note that
      the first model above is not appropriate. It uses the fact
      that XHTML contains no tricks - it is "plain vanilla" in that
      everything in the document is part of the document in the
      same way, modulo styling. (I simplify). This does not apply
      to other sorts of document. Take an XML package for example:
      the contents of the package are quoted and it is not
      appropriate just to expand them. Only the
      cover note, the defining document, contains the import of the
      package as a whole, and the interpretation of the other
      packaged things is only known in as much as the cover note
      defines it. It is essential that languages such as XML
      packaging can be defined in XML. It is essential that one
      can, if you like, quote a bit of XML literally, and make up a
      new tag which says something quite new about the contents.
      Therefore, while it works with XHTML, and as Tim Bray says
      (TAG 2002/02/14) there are many applications which do
      "generic XML processing" such as trawling documents for links
      and use of language, there will be certain namespaces such as
      HTML and SVG for which that makes sense and others such as
      XML packaging and XML encryption, in which it won't. <em>(On
      the semantic web case, the same applies, and was the cause in
      2002 of much discussion in RDF groups because RDF does not
      have quotes, and the informal use of
      rdf:parseType="log:quote")</em>
    </p>
    <p>
      If you need another example, think about the XSLT insert
      which generates an XInclude element: It may contain what
      seems to be and even is an XInclude element, but should not
      be expanded as contents of the XSLT element.
    </p>
    <p>
      The reference processing model must be then, that parsing of
      an XML document conceptually involves elaboration of any
      functions, and that processors must be able to dispatch based
      on namespace at any point in the tree.
    </p>
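    <p>
      A rough sketch of such a reference model follows. It is not
      taken from any specification: the two namespace URIs, the toy
      <code>repeat</code> function and the registry are all invented
      for illustration, and text between elements is ignored to keep
      the sketch short. The walk is top down, dispatches on namespace
      at every node, re-examines whatever a function returns, and
      leaves quoted content alone.
    </p>

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical namespaces, invented for this sketch.
XHTML = "http://www.w3.org/1999/xhtml"
DEMO_NS = "http://example.org/ns/demo-function"
QUOTE_NS = "http://example.org/ns/demo-quote"


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    if isinstance(tag, str) and tag.startswith("{"):
        return tag[1:].partition("}")[0]
    return ""


def repeat_handler(element):
    """Toy XML function: returns `count` copies of its child elements."""
    count = int(element.get("count", "1"))
    return [copy.deepcopy(child) for _ in range(count) for child in element]


FUNCTIONS = {DEMO_NS: repeat_handler}   # namespace -> function handler
QUOTING = {QUOTE_NS}                    # namespaces whose content is quoted


def elaborate(element):
    """Elaborate XML functions below `element`, top down, in place."""
    if namespace_of(element) in QUOTING:
        return  # quoted content is passed through untouched
    done = []
    pending = list(element)
    while pending:
        child = pending.pop(0)
        handler = FUNCTIONS.get(namespace_of(child))
        if handler is not None:
            # The result replaces the function element and may itself
            # contain functions, so it goes back on the queue.
            pending = handler(child) + pending
        else:
            elaborate(child)
            done.append(child)
    element[:] = done


# Build: body containing a repeat function and a quoted repeat function.
root = ET.Element("{%s}body" % XHTML)
rep = ET.SubElement(root, "{%s}repeat" % DEMO_NS, {"count": "2"})
ET.SubElement(rep, "{%s}p" % XHTML).text = "hello"
quote = ET.SubElement(root, "{%s}literal" % QUOTE_NS)
inner = ET.SubElement(quote, "{%s}repeat" % DEMO_NS, {"count": "3"})
ET.SubElement(inner, "{%s}p" % XHTML).text = "quoted"

elaborate(root)
```

    <p>
      A function here can only replace itself with its result, so
      its ancestors and siblings are never affected, which is the
      constraint listed for XML functions earlier.
    </p>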
    <p>
      The result of such processing is the document which should
      correspond to the XML schema, if there is one. There is normally no
      call for schema validation of the document which still
      contains XML functions. Systems which claim to be conformant
      to the spec of a given XML function mean that they can, in
      any XML document, elaborate the function according to the
      specification. As Jacek Kopecky says (2002/02/21), <em>[...]
      by saying on the sender: "We expect the XHTML processor to be
      able to handle XInclude and therefore this thing is an XHTML
      document all right"</em>. We can't of course expect old XML
      processors to handle XInclude, but we can expect anything
      which claims conformance with XInclude to do so.
    </p>
    <h3>
      <a name="Software" id="Software"></a>Software designs for
      top-down processing
    </h3>
    <p>
      In object-oriented software terms, one imagines handing an
      XML tree to an instance of an object class which supports the
      element type of the document element. This then returns
      something as defined by the spec. (An HTML document
      conceptually returns a hypertext page, an SVG document a
      diagram, an RDF document a conceptual graph (small c small
      g)). The object may itself call out to a parser to get the
      infoset for its contents, and it may or may not call out to
      the XML function evaluator but whether it does or not is
      defined by its own specification. But XML functions just
      return XML which replaces them. And any XML applications
      which claim conformance to the XML function's spec should be
      able to accept this.
    </p>
    <p>
      Similarly, in an event-oriented architecture, an event
      stream which is being fed to an HTML handler would, when a
      foreign namespace such as XSLT is found, be vectored to an XSLT
      handler. The software design has to allow the XSLT handler to
      hand back a new set of events, a serialization of the resultant
      tree, to the HTML handler.
    </p>
    <p>
      The software design in either case also has to allow enough
      context to be shared between the applications so that they can
      perform their function: embedded SVG needs a display context
      such as part of a drawing space which corresponds to the space
      in the rendering of the HTML document, and so on.
    </p>
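    <p>
      The event-oriented design might be sketched as below. The event
      tuples, the example namespace and the handler names are
      assumptions made for the illustration, not part of any real
      parser API, and the stream is assumed to be well formed. The
      point shown is only that the foreign handler hands a new set of
      events back, and that the HTML handler then carries on with
      those.
    </p>

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
UPPER_NS = "http://example.org/ns/upper"  # hypothetical function namespace

# Events are tuples: ("start", namespace, name), ("end", namespace, name)
# or ("text", data). This shape is an assumption of the sketch.


def upper_handler(events):
    """Toy foreign handler: drops its own start/end tags, upper-cases text."""
    inner = events[1:-1]
    return [("text", e[1].upper()) if e[0] == "text" else e for e in inner]


# Foreign namespaces and the handlers to which their subtrees are vectored.
HANDLERS = {UPPER_NS: upper_handler}


def html_handler(events):
    """Consume an event stream, vectoring foreign subtrees to their handlers."""
    queue = list(events)
    output = []
    i = 0
    while i < len(queue):
        event = queue[i]
        if event[0] == "start" and event[1] in HANDLERS:
            # Find the end of the foreign subtree by counting nesting depth.
            depth = 0
            j = i
            while True:
                kind = queue[j][0]
                if kind == "start":
                    depth += 1
                elif kind == "end":
                    depth -= 1
                j += 1
                if depth == 0:
                    break
            # The foreign handler hands back a new set of events, which take
            # the place of the subtree and are then examined like any others.
            queue[i:j] = HANDLERS[event[1]](queue[i:j])
            continue
        output.append(event)  # stands in for the real HTML processing
        i += 1
    return output
```
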
    <h3>
      <a name="siblings" id="siblings"></a>Unresolved issue:
      references to siblings
    </h3>
    <p>
      This note does not address many of the issues around the XML
      processing model.
    </p>
    <p>
      There is a possible ambiguity when a function refers to the
      current document. In other words, though it is not allowed to
      change things outside itself, it may read outside itself.
      This (if allowed) would clearly raise the question of whether
      it references the document before or after its own or other
      functions' elaboration.
    </p>
    <p>
      A related question is whether an XPointer fragment identifier
      should refer to the document before or after elaboration of
      functions. My inclination is to say after, because then you
      know that an XPointer into an SVG object will resolve to a
      bit of SVG. But there may be arguments the other way.
    </p>
    <p>
      XML Digital Signature (I am told) specifically requires that
      the signature is done on the raw source of the document
      before XInclude. Without going into the relative merits of
      signature before and after XInclude and other functions, it
      is clear that there are cases when either would be useful.
    </p>
    <p>
      The ambiguity of these references, like the problems in XSLT
      of generating XSLT stylesheets with XSLT stylesheets, stems
      from the lack of quoting syntax in XML.
    </p>
    <h3>
      <a name="MIME" id="MIME">MIME content-type labeling</a>
    </h3>
    <p>
      <em>@@This section is not complete. It has been covered more
      thoroughly by TAG discussions already. @@ link</em>
    </p>
    <p>
      An XML document is in most cases self-describing. That is,
      you don't need to know anything more than that it is XML to
      interpret it. In email and HTTP applications, it is useful
      for the RFC822-style message to define how the body should be
      interpreted using the <code>content-type</code> header. All
      that is necessary, then, is that the content-type should
      indicate XML (<code>text/xml</code> or
      <code>application/xml</code> or anything with
      <code>+xml</code>), and top-down generic processing is
      valid. (The algorithm for determining the character encoding
      is not addressed here @@ link)
    </p>
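    <p>
      As a minimal sketch of that test, assuming the receiver has the
      raw value of the <code>content-type</code> header as a string
      (the function name is invented here, and character encoding is
      ignored, as above):
    </p>

```python
def is_xml_content_type(header_value):
    """Return True if a content-type header value labels the body as XML."""
    # Drop any parameters such as "; charset=utf-8" and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    if media_type in ("text/xml", "application/xml"):
        return True
    # Any other type whose subtype carries the "+xml" suffix.
    return "/" in media_type and media_type.endswith("+xml")
```
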
    <p>
      While this is sufficient, it is however useful to be able to
      provide more visibility as to what it contains [Roy Fielding,
      Dissertation, Ch4 @@link]. The document element gives, in
      many cases, fundamental information about the resulting type
      of the whole document, irrespective of functions elaborated
      or plugins plugged in. For example, whatever the content, an
      <code>xhtml:html</code> document is a hypertext page. This
      means that some systems will represent it in a window and
      allow certain functionality. The operating system, if it
      knows this, can use icons to tell the user, before they open
      an email or follow a link, what sort of a thing it contains
      or leads to. Similarly, an SVG document will return a
      diagram, and an RDF document a body of knowledge -- a set of
      relational data. So more than any other namespace used in the
      document, the document element namespace is crucial.
    </p>
    <p>
      This is why the best practice is to publish documents with
      standard and therefore well-known document element types as a
      special MIME type. This allows an XHTML page to be visible as
      such from the HTTP headers alone. This allows smarter
      processing by intermediaries, decisions about proxy caching,
      translation, and so on. It allows the content negotiation of
      HTTP to operate, allowing a user for example to express a
      preference for audio rather than video. It also allows
      systems which wish to do so to optimize the dispatching of a
      handler for the document from the MIME type alone. A "+xml"
      suffix as defined by RFC____@@ should be used whenever the
      document is also a self-describing top-down XML document for
      which the top-down processing model applies. (The fact that a
      document is a well-formed XML 1.0 document alone does
      <em>not</em> constitute grounds for adding the "+xml".)
    </p>
    <p>
      Simon St Laurent has suggested [@@ his Internet-draft,
      possibly timed out] that all namespaces used in the document
      be listed as parameters to the MIME type. This makes sense on
      the surface. It may not be practical or worth the effort. It
      is a lot of bits, and in any case exactly what will be
      required cannot be determined until the document has been
      interpreted top-down. However, it or something equivalent is
      necessary if one is to specify the software support which is
      necessary.
    </p>
    <ul>
      <li>The top element can in fact be such that the other
      elements are to be interpreted in arbitrarily weird ways
      </li>
      <li>For many document element types, there is a guarantee of
      the sort of object which is being represented.
      </li>
    </ul>So the best form of visibility would be to state (and
    possibly negotiate) the set of XML features which must be
    supported to properly process the document.
    <h3>
      <a name="implied" id="implied"></a>Related notion: implied
      namespace
    </h3>
    <p>
      When a namespace-specific content-type has been specified, is
      it also necessary to specify the document namespace, or could
      that be assumed? That would mean that a plain XHTML file
      would not need an explicit namespace. It is tempting to say
      that the default namespace should default to that associated
      with the content type, but in fact the logical thing is for
      the namespace of the document element to do so.
    </p>
    <p>
      @@Decision tree diagram - add
    </p>
    <h3>
      <a name="L1578" id="L1578">User-defined processing of
      documents</a>
    </h3>
    <p>
      This document defines the basic interpretation of an XML
      document. There have been many suggestions of ways in which a
      complex and different order of processing could be specified,
      many of these mentioned at the workshop, and including Sun's
      XML pipeline submission. My current view is that such
      languages should themselves be regarded as the top-level
      document, which then draws in the rest of the document by
      reference as it is elaborated.
    </p>
    <h3>
      <a name="L1610" id="L1610">Server-side processing of
      documents</a>
    </h3>
    <p>
      In the HTTP protocol, or email for that matter, the important
      interface which is standardized is the one between the
      publisher (or sender) and receiver. We concern ourselves with
      what a receiver can do by way of interpretation of an XML
      document published or sent. Any processing which has happened
      on the server or sender side in order to process that
      document is not part of the protocol. While XML functions may
      indeed be elaborated to form a document for transmission from
      another one, that is something for control within the server
      and so is not a primary concern for standardization.
    </p>
    <p>
      When a document is in a pure functional form, it actually is
      an optimization whether the functions are elaborated by the
      server or the client.
    </p>
    <h2>
      <a name="schema" id="schema"></a>The requirements on Schema
      languages
    </h2>This tree-oriented architecture for XML puts requirements
    on schema languages. With DTDs, and with current XML schema,
    there was no natural way to describe how namespaces fit
    together. There have been many rather unnatural attempts to
    create a modular system, such as the HTML modularization
    @@link. The way this has been done has basically been to make
    one great big schema for the combined language, in such a way
    that the new schema constrains the way the elements from
    different namespaces can fit together.<br>
    The problem is to avoid making this an n<sup>2</sup> problem.
    Will the working group which integrates n specs (such as 4 for
    XHTML, SVG, XForms, MathML) take n<sup>2</sup> years to make
    the schema? It would be far preferable if one could just write
    a schema for each new facility.<br>
    <br>
    Conversely, what would a schema language have to allow us
    to say:<br>
    <ul>
      <li>
        <span style="font-family: monospace;">&lt;its:info
        translate="no"&gt;<span style="font-style: italic;">x</span>&lt;/its:info&gt;</span> can
        occur anywhere <span style="font-family: monospace; font-style: italic;">x</span> can
        occur (for systems which support ITS)
      </li>
      <li>&lt;its:info translate="no"&gt;x&lt;/its:info&gt; can
      occur anywhere x can occur, so long as x is human-presentable
      content.
      </li>
      <li>&lt;xenc:encrypted&gt; ...&lt;/&gt; is a function: it can
      occur anywhere, so long as XEnc is supported. Its processing
      will return XML mixed content which will replace this
      element.
      </li>
      <li>&lt;svg:drawing/&gt; can occur anywhere &lt;xhtml:img
      /&gt; can occur.
      </li>
    </ul>This way of specifying n independent schemas, or rather
    schemas which have back-references to earlier schemas in some
    cases, allows a product simply to quote the set of XML
    technologies which it supports. This set has to be negotiated
    between the sender and receiver of XML. It is not, in
    the general case, the same as the set of namespaces used in the document,
    because function elaboration may change that. All the same, the
    namespaces may be a useful way of indirectly referring to the
    features.
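    <p>
      To illustrate how n independent schemas with back-references
      might combine without one great big schema, here is a rough
      sketch. No existing schema language is being described: the data
      structures, the feature names and the <code>ex:chart</code>
      element are invented for the example, and only the rule that
      <code>svg:drawing</code> can occur anywhere
      <code>xhtml:img</code> can occur is taken from the list above.
    </p>

```python
# Base content model of one language (hypothetical, heavily simplified):
# parent element -> the children it allows.
BASE_CONTENT = {
    "xhtml:p": {"xhtml:img", "xhtml:em"},
}

# Rules contributed independently by later schemas. Each is a back-reference:
# (new element, element whose positions it may occupy, feature required).
SUBSTITUTION_RULES = [
    ("svg:drawing", "xhtml:img", "SVG"),
    # Invented rule, showing a back-reference to a schema which is itself
    # an addition.
    ("ex:chart", "svg:drawing", "ExampleCharts"),
]


def allowed_children(parent, supported_features):
    """Children allowed in `parent`, given the features a product supports."""
    allowed = set(BASE_CONTENT.get(parent, ()))
    changed = True
    while changed:  # repeat, so that rules can build on other rules
        changed = False
        for new, existing, feature in SUBSTITUTION_RULES:
            if (feature in supported_features
                    and existing in allowed
                    and new not in allowed):
                allowed.add(new)
                changed = True
    return allowed
```

    <p>
      Each new facility adds one rule, and a product quotes only the
      set of features it supports.
    </p>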
    <p>
      Because the mode of operation in which the content is
      evaluated with function processing is very common, it would
      be useful in a schema for example to indicate this mode, or,
      more practically, to indicate the exceptions. There are very
      few elements which don't elaborate their contents at the
      moment in the markup world, and so they should be the
      exception. (Many computing languages of course reserve
      special punctuation for this quoting but adding punctuation
      at this stage isn't the XML style!)
    </p>
    <h2>
      <a name="Conclusion" id="Conclusion"></a>Conclusion
    </h2>
    <p>
      The top-down processing model for XML as an architectural
      principle resolves many of the questions which remain
      unanswerable with pipelined processing. In fact,
      consideration of the example shows that pipeline processing
      could actually be dangerous, producing errors and possibly
      security issues, in the case of generally nested XML
      technologies of the types discussed.
    </p>
    <h2>
      <a name="References" id="References">References</a>
    </h2>
    <ul>
      <li>Discussion on www-tag@w3.org list
        <ul>
          <li>
            <a href="http://lists.w3.org/Archives/Public/www-tag/2002Feb/0123.html">
            19 Feb 2002, Paul Prescod Namespace Dispatching</a>
          </li>
        </ul>
      </li>
      <li>
        <a href="http://www.imc.org/ietf-xml-mime/mail-archive/threads.html">
        The archive of the XML-MIME list</a>, relevant to MIME
        dispatching of XML documents
      </li>
      <li>
        <a href="http://www.w3.org/XML/2001/07/XMLPM.html">XML
        processing model workshop</a>
      </li>
      <li>TAG Issue <a href="http://www.w3.org/2001/tag/issues.html#xmlFunctions-34">XMLFunctions-34</a>
      </li>
      <li>W3C Specifications: <a href="http://www.w3.org/TR/REC-xml/">XML spec</a>; <a href="http://www.w3.org/TR/REC-xml-names/">Namespaces in XML</a>
      </li>
      <li>
        <a href="http://www.w3.org/TR/2002/NOTE-xml-pipeline-20020228/">XML
        pipeline definition language</a>, Sun Microsystems
      </li>
      <li>
        <a href="http://lists.w3.org/Archives/Member/xml-pm-ws/2001Jul/thread.html">
        XML Processing Model discussion list</a> (W3C members
        archive)
      </li>
    </ul>
    ]]></description>
  </item>
    </channel>
  </rss>
  