<?xml version="1.0"?>
<rss version="2.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:yt="http://gdata.youtube.com/schemas/2007">
   <channel>
      <title>Semantic Web pipe</title>
      <description>What's going on in the Semantic Web?</description>
      <link>http://pipes.yahoo.com/pipes/pipe.info?_id=JsAx1rN13BGzogf56UjTQA</link>
      <pubDate>Sat, 21 Nov 2009 13:58:16 -0800</pubDate>
      <generator>http://pipes.yahoo.com/pipes/</generator>
      <item>
         <title>Semantic Web Demozone</title>
         <link>http://leobard.twoday.net/stories/6053084/</link>
         <description>If someone asks me &quot;what is the semantic web&quot;, I have a new answer: go look into the demozone.&lt;br /&gt;
&lt;br /&gt;
&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://demozone.semantic-web.at/&quot;&gt;http://demozone.semantic-web.at/&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://demozone.semantic-web.at/&quot;&gt;&lt;img src=&quot;http://demozone.semantic-web.at/images/logo.gif&quot; alt=&quot;demozone&quot;/&gt;&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
Congratulations to the team of the &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semantic-web.at/&quot;&gt;semantic web company&lt;/a&gt; for launching this. I now see these guys as the world's premier semantic web consulting agency: in a vendor-neutral way, they show what semantics can get you.&lt;br /&gt;
&lt;br /&gt;
Before this, I only had the &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.w3.org/2001/sw/sweo/public/UseCases/&quot;&gt;W3C SWEO use case collection&lt;/a&gt;, now I have two answers. Good work!&lt;br /&gt;
&lt;br /&gt;
(early adopters: yes, it's been out there for some weeks now, but I still think we should blog about it)</description>
         <guid isPermaLink="false">JsAx1rN13BGzogf56UjTQA_9cf25c0aa99907d3b2342a0f42ddf923</guid>
         <pubDate>Fri, 20 Nov 2009 10:28:00 -0800</pubDate>
      </item>
      <item>
         <title>Scaling Up at the Tetherless World Constellation in 2009</title>
         <link>http://tw.rpi.edu/weblog/2009/11/20/scaling-up-at-twc-2009/</link>
         <description>Since this is my first post to the Tetherless World blog, perhaps a brief introduction is in order. I&amp;#8217;m Jesse Weaver, one of Jim Hendler&amp;#8217;s Ph.D. students in the Tetherless World Constellation (TWC) at Rensselaer Polytechnic Institute (RPI). My general research interest is in high-performance computing for the semantic web. Specifically, I have been looking [...]</description>
         <guid isPermaLink="false">http://tw.rpi.edu/weblog/?p=219</guid>
         <pubDate>Fri, 20 Nov 2009 14:15:50 -0800</pubDate>
         <content:encoded><![CDATA[<p>Since this is my first post to the Tetherless World blog, perhaps a brief introduction is in order. I&#8217;m <a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~weavej3/">Jesse Weaver</a>, one of <a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~hendler/">Jim Hendler</a>&#8217;s Ph.D. students in the <a rel="nofollow" target="_blank" href="http://tw.rpi.edu/">Tetherless World Constellation</a> (TWC) at <a rel="nofollow" target="_blank" href="http://www.rpi.edu/">Rensselaer Polytechnic Institute</a> (RPI). My general research interest is in high-performance computing for the semantic web. Specifically, I have been looking at employing parallelism on cluster architectures for rule-based reasoning and RDF query. Since joining TWC in Fall 2008, I have been working with colleagues toward this end, and it is that work that I would like to share in this blog post.</p>
<p>Jim and I recently published a paper at ISWC 2009 entitled <em><a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~weavej3/#ParallelRDFS">Parallel Materialization of the Finite RDFS Closure for Hundreds of Millions of Triples</a></em>. Since the time that paper was accepted (and as presented at ISWC), we have actually scaled to billions of triples. We show in this paper that the RDFS rules can be applied to independent partitions of data to produce the RDFS closure for all of the data, as long as each partition has the ontologies. In parallel computing terms, the RDFS closure can be computed in an embarrassingly parallel fashion. &#8220;Embarrassingly parallel&#8221; is a technical term from parallel computing describing a computation that can be divided into completely independent parts. Such computations are considered ideal for parallelism because there is no need for communication between processes and hence there is essentially no overhead for parallelization. <a rel="nofollow" target="_blank" href="http://ect.bell-labs.com/who/pfps/">Peter Patel-Schneider</a> had some good questions and comments after the presentation. I have made my responses publicly available in a brief <a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~weavej3/papers/iswc2009-notes.txt">note</a>.</p>
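<p>To make the embarrassingly parallel idea concrete, here is a minimal Python sketch. It is an illustration only, not the code from the paper: it covers just the subproperty and subclass rules, and the <code>ex:</code> terms, the prefixed-name strings, and the function names are assumptions made up for this example. Each partition is materialized together with the full ontology and without any communication, and the union of the per-partition results equals the result of materializing all of the data at once.</p>

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"

def superterms(ontology, predicate):
    """Map each term to all of its (transitive) super-terms under `predicate`."""
    direct = {}
    for s, p, o in ontology:
        if p == predicate:
            direct.setdefault(s, set()).add(o)
    closure = {}
    for term in direct:
        seen, stack = set(), [term]
        while stack:
            for parent in direct.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        closure[term] = seen
    return closure

def materialize(ontology, instances):
    """Apply the subproperty and subclass rules to a set of instance triples."""
    sup_props = superterms(ontology, SUBPROP)
    sup_classes = superterms(ontology, SUBCLASS)
    inferred = set(instances)
    for s, p, o in instances:
        for q in sup_props.get(p, ()):
            inferred.add((s, q, o))
    for s, p, o in list(inferred):
        if p == RDF_TYPE:
            for c in sup_classes.get(o, ()):
                inferred.add((s, RDF_TYPE, c))
    return inferred

# Hypothetical toy ontology and instance data (assumptions for illustration).
ontology = [
    ("foaf:Person", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("foaf:name", SUBPROP, "ex:name"),
]
instances = [
    ("ex:jesse", RDF_TYPE, "foaf:Person"),
    ("ex:jesse", "foaf:name", "Jesse"),
    ("ex:greg", RDF_TYPE, "foaf:Person"),
    ("ex:greg", "foaf:name", "Greg"),
]

# Any split of the instance triples works; here, two round-robin partitions.
partitions = [instances[0::2], instances[1::2]]
merged = set()
for part in partitions:
    merged |= materialize(ontology, part)  # each call is independent

whole = materialize(ontology, instances)
print(merged == whole)
```

<p>In a real cluster each call to <code>materialize</code> would run in its own process; the loop here only simulates that on one machine.</p>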
<p><a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~willig4/">Gregory Todd Williams</a> and I published a paper at SSWS 2009 entitled <em><a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~weavej3/#ScalableRDFQuery">Scalable RDF query processing on clusters and supercomputers</a></em>. This paper shows how parallel hash joins can be used on high-performance clusters to efficiently query large RDF datasets. It seemed to get a lot of attention at the SSWS workshop as well as stir up a little bit of controversy. The interesting thing about our approach is that no global indexes are created. Each process in the cluster gets a portion of the data and indexes it locally, but no global indexes are maintained (e.g., we do not globally dictionary encode RDF terms). This allows us to load data extremely quickly with some cost to query time. In many cases, though, the decrease in loading time outweighs the added cost in query time. (The added cost in query time comes from communicating full string values instead of global IDs during the parallel hash join.) This allows for exploratory querying and easy handling of dynamically changing data. Whereas many previous query systems depend heavily on global indexes (for which loading can take on the order of hours or days), we can load large datasets on the order of seconds and minutes. Therefore, if the data changes, it can just be reloaded instead of updating indexes.</p>
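<p>The general shape of a parallel hash join without global indexes can be sketched in Python. This is a single-machine simulation under stated assumptions, not the system described in the paper: the "processes" are plain lists, the <code>owner</code> routing function and all other names are hypothetical, and the query is fixed to a two-pattern join on the subject. Note that the full string values are what gets routed between processes, which is where the added query-time cost comes from.</p>

```python
from collections import defaultdict
import zlib

def owner(value, nprocs):
    """Pick the process responsible for a join value (deterministic hash)."""
    return zlib.crc32(value.encode("utf-8")) % nprocs

def match(triples, predicate):
    """Local scan: (subject, object) pairs for triples with this predicate."""
    return [(s, o) for s, p, o in triples if p == predicate]

def parallel_hash_join(local_data, pred_a, pred_b):
    """Join (?x pred_a ?a) with (?x pred_b ?b) on ?x across simulated processes."""
    nprocs = len(local_data)
    inbox_a = [defaultdict(list) for _ in range(nprocs)]
    inbox_b = [defaultdict(list) for _ in range(nprocs)]
    # Exchange phase: every process routes its matches by hash of the join key,
    # sending the full string values rather than global IDs.
    for triples in local_data:
        for x, a in match(triples, pred_a):
            inbox_a[owner(x, nprocs)][x].append(a)
        for x, b in match(triples, pred_b):
            inbox_b[owner(x, nprocs)][x].append(b)
    # Join phase: each process probes its own hash tables independently.
    results = []
    for rank in range(nprocs):
        for x, a_values in inbox_a[rank].items():
            for a in a_values:
                for b in inbox_b[rank].get(x, ()):
                    results.append((x, a, b))
    return sorted(results)

# Hypothetical data: each inner list is what one process loaded and indexed locally.
local_data = [
    [("ex:jesse", "ex:name", "Jesse"), ("ex:greg", "ex:email", "greg@example.org")],
    [("ex:greg", "ex:name", "Greg"), ("ex:jesse", "ex:email", "jesse@example.org")],
]
joined = parallel_hash_join(local_data, "ex:name", "ex:email")
print(joined)
```

<p>Because nothing global is built at load time, reloading changed data only means refilling the per-process lists.</p>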
<p>Finally, Greg, <a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~atrem/">Medha Atre</a>, Jim, and I submitted a paper to the <a rel="nofollow" target="_blank" href="http://www.cs.vu.nl/~pmika/swc/submissions2009.html">Billion Triples Challenge</a> (BTC), which we won!</p>
<div style="text-align:center;"><img src="http://www.cs.rpi.edu/~weavej3/btc2009/Btc-winner-2009.png" alt="Greg and Jesse accept the award for 1st place at the 2009 Billion Triples Challenge" width="300" height="225"/></div>
<p>We combined three systems for our submission. First, we created a simple <a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~weavej3/btc2009/#upper">upper ontology</a> of 31 triples for our domain of interest, linking established concepts of Person to our concept of Person (by subclass), and we did the same for many relevant properties (name, email, etc.) (by subproperty). Then, we used the aforementioned parallel materialization work to produce inferences on the BTC dataset, inferring triples that use our terms from the upper ontology. Using the aforementioned work on scalable query, we then extracted only our triples of interest. This <a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~weavej3/btc2009/#reduced">reduced dataset</a> is almost 800K triples, much more manageable than the original 900M triples, and it can now be used by existing tools without much concern about dataset size. As a finishing touch, we compressed the reduced dataset down into a <a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~atrem/#Research">BitMat</a> RDF data structure, resulting in a final disk space of 8 MB for the triples and 25 MB for the dictionary encoding. Simple basic graph pattern queries can be executed against the BitMat. The entire process took roughly 22 minutes. See more about the submission at <a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~weavej3/btc2009/">our BTC website</a>, which contains the datasets and some statistics about them.</p>
<div style="text-align:center;"><a rel="nofollow" title="Jesse's BTC Presentation by kasei, on Flickr" target="_blank" href="http://www.flickr.com/photos/kasei/4055714142/"><img src="http://farm3.static.flickr.com/2466/4055714142_de5795153f.jpg" alt="Jesse presents at the 2009 Billion Triples Challenge" width="333" height="222"/></a></div>
<p>That being said, the future holds much work to be done for scalability in the semantic web domain.</p>
<p>At present, I am looking at formalizing a more general notion of &#8220;abox partitioning&#8221; for the purpose of classifying rules that fit such a paradigm, and then exploring its application to OWL2RL. Some parts of OWL2RL&#8212;like symmetric properties and inverse properties&#8212;clearly fit in the inferencing scheme from the parallel materialization paper. However, many of the much-desired features&#8212;like inverse functional properties and owl:sameAs&#8212;do not. For such rules, parallel hash joins may be needed, or perhaps a more clever partitioning scheme.</p>
<p>We could also improve loading time of these systems (and perhaps communication time during parallel hash joins) by using an RDF syntax that is less verbose than <a rel="nofollow" target="_blank" href="http://www.w3.org/TR/rdf-testcases/#ntriples">N-Triples</a>, but not as complex as <a rel="nofollow" target="_blank" href="http://www.w3.org/TeamSubmission/turtle/">Turtle</a>. (Remember, we are concerned about <strong>parallel</strong> I/O.) To that end, we are exploring defining a subset of Turtle that would be helpful for I/O purposes without trading off the inherent simplicity of N-Triples (one triple per line).</p>
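<p>One reason a one-triple-per-line syntax suits parallel I/O is that a file can be cut into byte ranges and each process can find the line boundaries in its own range without coordinating with the others. The Python sketch below illustrates that general technique under stated assumptions: <code>read_partition</code> is a hypothetical helper, an in-memory byte string stands in for a file on a parallel file system, and the <code>example.org</code> triples are made up.</p>

```python
import io

def read_partition(data, index, parts):
    """Return the complete lines whose first byte falls in partition `index`."""
    size = len(data)
    start = size * index // parts
    end = size * (index + 1) // parts
    stream = io.BytesIO(data)
    if start > 0:
        # Step back one byte and discard through the next newline, so a line
        # that began in the previous partition is left to that partition.
        stream.seek(start - 1)
        stream.readline()
    lines = []
    # Keep reading while the next line starts before the end of our range;
    # the last line read may run past `end`, and the next partition skips it.
    while end > stream.tell():
        line = stream.readline()
        if not line:
            break
        lines.append(line.rstrip(b"\n"))
    return lines

# Hypothetical N-Triples data, one triple per line.
DATA = (
    b'<http://example.org/a> <http://example.org/p> "one" .\n'
    b'<http://example.org/b> <http://example.org/p> "two" .\n'
    b'<http://example.org/c> <http://example.org/p> "three" .\n'
)

for i in range(3):
    print(i, len(read_partition(DATA, i, 3)))
```

<p>Every line is read by exactly one partition, whatever the number of partitions. A syntax in which one statement can span several lines would not allow this simple boundary rule.</p>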
<p>We would also like to start employing more memory-efficient RDF storage data structures (like BitMat or Parliament) directly in our systems. This is particularly important for the Blue Gene/L architecture which has at most 1 GB of memory per node.</p>
<p>And speaking of the Blue Gene/L, I have been doing all my work at RPI&#8217;s fabulous <a rel="nofollow" target="_blank" href="http://www.rpi.edu/research/ccni/">Computational Center for Nanotechnology Innovations</a> (CCNI). The CCNI is really a great computation facility having parallel file systems, high performance clusters, large SMP machines, and&#8212;of course&#8212;a Blue Gene/L. Such a resource is a great enabler for our research.</p>
<p>Jesse Weaver<br />
Ph.D. Student, Patroon Fellow<br />
Tetherless World Constellation<br />
Rensselaer Polytechnic Institute<br />
<a rel="nofollow" target="_blank" href="http://www.cs.rpi.edu/~weavej3/">http://www.cs.rpi.edu/~weavej3/</a></p>]]></content:encoded>
         <category>tetherless world</category>
      </item>
      <item>
         <title>This Week at DERI</title>
         <link>http://blog.deri.ie/index.php?id=452&amp;no_cache=1&amp;tx_ttnews[tt_news]=594</link>
         <description>&lt;h4&gt;All Infrastructure systems are back online&lt;/h4&gt;
This week, due to excessive rainfall, part of the ground floor of DERI was flooded. As a precaution, all electrical equipment on the ground floor was turned off, including all equipment in the DERI server room. This equipment has now been turned back on.
Please accept our apologies for any inconvenience caused.
&lt;h4&gt;Sig.ma explained in the latest issue of Talis' Nodalities magazine&lt;/h4&gt;
Sig.ma is developed by Szymon Danielczyk, Richard Cyganiak, Michele Catasta and Giovanni Tummarello. In the latest issue of Talis' Nodalities magazine, Michael Hausenblas and Richard Cyganiak explained Sig.ma, &quot;a visual Web Data aggregation and querying platform targeting entity visualisation and consolidation.&quot;
&lt;h4&gt;Science Spin - Semantic Web, What's it all about? with Prof. Dr. Stefan Decker&lt;/h4&gt;
Science Spin is broadcast every Thursday afternoon from 3:30pm to 4pm on Dublin City FM, 103.2FM. The show is written and presented by Seán Duke, science writer and editor.
Stefan Decker, Director of DERI NUI Galway, was interviewed about the semantic web. What is it? How does it work? Why is it important?...
&lt;h4&gt;Survey: How do Semantic Web researchers use Web 2.0 to communicate about their work?&lt;/h4&gt;
This survey was set up as part of a research MSc. at DERI, NUI Galway. Our aim is to study the habits and motivations of the Semantic Web research community in publishing and sharing content online using Web 2.0 services. Thus, if you are researching Semantic Web technologies, we would really appreciate it if you could take the survey.</description>
         <guid isPermaLink="false">http://blog.deri.ie/index.php?id=452&amp;no_cache=1&amp;tx_ttnews[tt_news]=594</guid>
         <pubDate>Fri, 20 Nov 2009 09:33:30 -0800</pubDate>
         <category>Awards</category>
      </item>
      <item>
         <title>Converting Word documents to DITA</title>
         <link>http://www.snee.com/bobdc.blog/2009/11/converting-word-documents-to-d.html</link>
         <description>Via OpenOffice and DocBook.</description>
         <guid isPermaLink="false">http://www.snee.com/bobdc.blog/2009/11/converting-word-documents-to-d.html</guid>
         <pubDate>Fri, 20 Nov 2009 06:37:31 -0800</pubDate>
         <category>DITA</category>
      </item>
      <item>
         <title>Twitter API enables geotagging</title>
         <link>http://ebiquity.umbc.edu/blogger/2009/11/20/twitter-api-enables-geotagging/</link>
         <description>Twitter turned on its API for geotagging tweets yesterday, as announced in a post on their blog, Think Globally, Tweet Locally. Currently, geographic information will only be associated with your tweets if you use an application that adds it and will only be used to display your tweets when viewed with an [...]</description>
         <guid isPermaLink="false">http://ebiquity.umbc.edu/blogger/?p=2713</guid>
         <pubDate>Fri, 20 Nov 2009 05:50:39 -0800</pubDate>
         <content:encoded><![CDATA[<p>Twitter turned on its API for geotagging tweets yesterday, as announced in a post on their blog, <a rel="nofollow" target="_blank" href="http://blog.twitter.com/2009/11/think-globally-tweet-locally.html">Think Globally, Tweet Locally</a>. Currently, geographic information will only be associated with your tweets if you use an application that adds it and will only be used to display your tweets when viewed with an application that can exploit it. Here&#8217;s the way Twitter described it.</p>
<blockquote><p>
&#8220;This release is unique in that it&#8217;s API-only which means you won&#8217;t see any changes on <a rel="nofollow" target="_blank" href="http://twitter.com/">twitter.com</a>, yet. Instead, Twitter applications like <a rel="nofollow" target="_blank" href="http://birdfeedapp.com/">Birdfeed</a>, <a rel="nofollow" target="_blank" href="http://www.seesmic.com/app">Seesmic Web</a>, <a rel="nofollow" target="_blank" href="http://foursquare.com/">Foursquare</a>, <a rel="nofollow" target="_blank" href="http://gowalla.com/">Gowalla</a>, <a rel="nofollow" target="_blank" href="http://twidroid.com/">Twidroid</a>, <a rel="nofollow" target="_blank" href="http://j.mp/twitpro">Twittelator Pro</a> and others are already supporting this new functionality (go try them out now!) in interesting ways that include geotagging your tweets and displaying the location from where a tweet was posted.&#8221;
</p></blockquote>
<p>Examining Twitter&#8217;s <a rel="nofollow" target="_blank" href="http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-statuses%C2%A0update">status update API </a> description shows how one associates a location with a Tweet. Pretty simple.</p>
<p>Since disclosing your location raises privacy concerns, Twitter has made geotagging an opt-in service and also allows users to delete all of the location information associated with their tweets. Moreover, their policy, as described <a rel="nofollow" target="_blank" href="http://help.twitter.com/forums/26810/entries/78525">here</a>, says</p>
<blockquote><p>
&#8220;We require application developers to be upfront and obvious about when they are Geotagging an update. If you ever find that an application is doing it without notifying you, please let us know.&#8221;
</p></blockquote>
<p>Twitter has updated its <a rel="nofollow" target="_blank" href="http://twitter.com/privacy">privacy policy</a> to cover location information.</p>
<p>You can read more on <a rel="nofollow" target="_blank" href="http://www.readwriteweb.com/archives/twitter_location_api_possible_uses.php">ReadWriteWeb</a> and <a rel="nofollow" target="_blank" href="http://www.techcrunch.com/2009/11/19/twitter-location-api/">Techcrunch</a>.</p>]]></content:encoded>
      </item>
      <item>
         <title>White House to make increasing use of RDFa</title>
         <link>http://rdfa.info/2009/11/20/white-house-planning-to-make-increasing-use-of-rdfa/</link>
         <description>According to InformationWeek, the White House is planning to make increasing use of RDFa. &amp;#8220;We have a lot of primary source content and have it exposed in ways that traditionally hasn&amp;#8217;t been done by government,&amp;#8221; Cole said. &amp;#8220;Instead of just having PDFs that are scanned, we&amp;#8217;re trying to reverse that trend.&amp;#8221;
More here: Obama Team [...]</description>
         <guid isPermaLink="false">http://rdfa.info/?p=186</guid>
         <pubDate>Fri, 20 Nov 2009 01:05:26 -0800</pubDate>
         <content:encoded><![CDATA[<p>According to InformationWeek, the White House is planning to make increasing use of RDFa. &#8220;We have a lot of primary source content and have it exposed in ways that traditionally hasn&#8217;t been done by government,&#8221; Cole said. &#8220;Instead of just having PDFs that are scanned, we&#8217;re trying to reverse that trend.&#8221;</p>
<p>More here: <a rel="nofollow" target="_blank" href="http://www.informationweek.com/news/government/info-management/showArticle.jhtml?articleID=221900361">Obama Team Challenges Web Developers</a></p>]]></content:encoded>
         <category>Usage</category>
      </item>
      <item>
         <title>http://openid4.me/ -- OpenId ♥ foaf+ssl</title>
         <link>http://blogs.sun.com/bblfish/entry/http_openid4_me_openid_foaf</link>
         <description>&lt;p&gt;&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://openid4.me/&quot;&gt;OpenId4.me&lt;/a&gt; is the bridge between &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://esw.w3.org/topic/foaf+ssl&quot;&gt;foaf+ssl&lt;/a&gt; and &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://openid.net/&quot;&gt;OpenId&lt;/a&gt; we have been waiting for.&lt;/p&gt;
&lt;p&gt;OpenId and foaf+ssl have a lot in common:
&lt;ul&gt;
&lt;li&gt;They both allow one to log into a web site without requiring one to divulge a password to that web site
&lt;li&gt;They both allow one to have a global identifier to log in, so that one does not need to create a username for each web site one wants to identify oneself at.
&lt;li&gt;They also allow one to give more information to the site about oneself, automatically, without requiring one to type that information into the site all over again.
&lt;/ul&gt;
&lt;p&gt;OpenId4.me allows a person with a foaf+ssl profile to automatically log in to the millions of web sites that enable authentication with OpenId. The really cool thing is that this person never has to set up an OpenId service. OpenId4.me does not even store any information about that person on its server: it uses all the information in the user's foaf profile and authenticates him with foaf+ssl. OpenId4.me does not yet implement attribute exchange, I think, but it should be relatively easy to do (depending on how easy it is to hack the initial OpenId code, I suppose).&lt;/p&gt;
&lt;p&gt;If you have a foaf+ssl cert (get one at &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://foaf.me/&quot;&gt;foaf.me&lt;/a&gt;) and are logging into an openid 2 service, all you need to type in the OpenId box is &lt;code&gt;openid4.me&lt;/code&gt;. This will then authenticate you using your foaf+ssl certificate, which works with most existing browsers without change!&lt;/p&gt;
&lt;p&gt;If you then want to own &lt;b&gt;your&lt;/b&gt; OpenId, then just add a little html to your home page. This is what I placed on &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://bblfish.net&quot;&gt;http://bblfish.net/&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&amp;lt;link rel=&quot;openid.server&quot; href=&quot;http://openid4.me/index.php&quot; /&amp;gt;
&amp;lt;link rel=&quot;openid2.provider openid.server&quot; href=&quot;http://openid4.me/index.php&quot;/&amp;gt;
&amp;lt;link rel=&quot;meta&quot; type=&quot;application/rdf+xml&quot; title=&quot;FOAF&quot; href=&quot;http://bblfish.net/people/henry/card%23me&quot;/&amp;gt;
&lt;/pre&gt;
&lt;p&gt;And that's it. Having done that you can then in the future change your openid provider very easily. You could even set up your own OpenId4.me server, as it is open source.&lt;/p&gt;
&lt;p&gt;More info at &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://openid4.me/&quot;&gt;OpenId4.me&lt;/a&gt;.&lt;/p&gt;</description>
         <guid isPermaLink="false">http://blogs.sun.com/bblfish/entry/http_openid4_me_openid_foaf</guid>
         <pubDate>Thu, 19 Nov 2009 10:57:08 -0800</pubDate>
      </item>
      <item>
         <title>Adrian Dale looks forward to Online Information 2009</title>
         <link>http://feedproxy.google.com/~r/Nodalities/~3/iLUPyKYho0I/adrian-dale-looks-forward-to-online-information-2009.php</link>
         <description>The twelve months that have elapsed since the previous Online Information Conference have seen an explosion in technologies that influence the information world and life in general.&amp;#160; What were being talked about as upcoming trends last year are now core to the agenda of this year's conference.
Conference Chair, Adrian Dale, joins me [...]</description>
         <guid isPermaLink="false">JsAx1rN13BGzogf56UjTQA_9afea498232f768dac3b143b86872096</guid>
         <pubDate>Thu, 19 Nov 2009 01:53:52 -0800</pubDate>
         <content:encoded/>
      </item>
      <item>
         <title>Australia to Join JES &amp; Co.'s Achievement Standards Network - PR Newswire (press release)</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/yumJKdLZybA/industry-news-australia-join-jes-amp-cos-achievement-standards-network-pr-newswire-press-release.htm</link>
         <description></description>
         <guid isPermaLink="false">4716 at http://www.semanticuniverse.com</guid>
         <pubDate>Thu, 19 Nov 2009 09:05:23 -0800</pubDate>
      </item>
      <item>
         <title>Putting a Conference into the Semantic Web</title>
         <link>http://tomheath.com/blog/2009/11/putting-a-conference-into-the-semantic-web/</link>
         <description>Chris Gutteridge asked this question about semantically enabling conference Web sites, which is a subject close to my heart. It&amp;#8217;s hard to give a meaningful response in 140 characters, so I decided to get some headline thoughts down for posterity. If you want a fuller account of some first-hand experiences, then the following papers are [...]</description>
         <guid isPermaLink="false">http://tomheath.com/blog/?p=124</guid>
         <pubDate>Thu, 19 Nov 2009 04:42:50 -0800</pubDate>
         <content:encoded><![CDATA[<p><a rel="nofollow" title="Chris Gutteridge" target="_blank" href="http://users.ecs.soton.ac.uk/cjg/">Chris Gutteridge</a> asked <a rel="nofollow" title="this question" target="_blank" href="http://twitter.com/cgutteridge/status/5839876111">this question</a> about semantically enabling conference Web sites, which is a subject close to my heart. It&#8217;s hard to give a meaningful response in 140 characters, so I decided to get some headline thoughts down for posterity. If you want a fuller account of some first-hand experiences, then the following papers are a good place to start:</p>
<ul>
<li>Tom Heath, John Domingue, and Paul Shabajee (2006) <a rel="nofollow" title="User Interaction and Uptake Challenges to Successfully Deploying Semantic Web Technologies" target="_blank" href="http://swui.semanticweb.org/swui06/papers/Heath/Heath.pdf">User Interaction and Uptake Challenges to Successfully Deploying Semantic Web Technologies</a>. In Proceedings of <a rel="nofollow" title="The 3rd International Semantic Web User Interaction Workshop (SWUI2006)" target="_blank" href="http://swui.semanticweb.org/swui06/">The 3rd International Semantic Web User Interaction Workshop (SWUI2006)</a>, 5th International Semantic Web Conference (ISWC2006), November 2006, Athens, GA, USA.</li>
<li>Knud Möller, Tom Heath, Siegfried Handschuh and John Domingue (2007) <a rel="nofollow" title="Recipes for Semantic Web Dog Food - The ESWC and ISWC Metadata Projects" target="_blank" href="http://iswc2007.semanticweb.org/papers/795.pdf">Recipes for Semantic Web Dog Food &#8211; The ESWC and ISWC Metadata Projects</a>. In Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC+ASWC2007), Busan, Korea. LNCS 4825.</li>
</ul>
<h4>Top Five Tips for Semantic Web-enabling a Conference</h4>
<p>1. Exploit Existing Workflows</p>
<p>Conferences are incredibly data-rich, but much of this richness is bound up in systems for e.g. paper submission, delegate registration, and scheduling, that aren&#8217;t native to the Semantic Web. Recognise this in advance and plan for how you intend to get the data from these systems out into the Web. The good news is that scripts now exist to handle dumps from submission systems such as EasyChair, but you may need to ensure that the conference instance of these systems is configured correctly for your needs. For example, getting dumps from these systems often comes at a price, and if you&#8217;re using one instance per track rather than the multi-track options, you may be in for a shock when you ask for the dumps. Speak to the Programme Chairs about this as soon as possible.</p>
<p>In my experience, delegate registration opens months in advance of a conference and often uses a proprietary, one-off system. As early as possible make contact with the person who will be developing and/or running this system, and agree how the registration system can be extended to collect data about the delegates and their affiliations, for example. Obviously there needs to be an opt-in process before this data is published on the public Web.</p>
<p>Collecting these types of data from existing workflows is monumentally easier than asking people to submit it later through some dedicated means. With this in mind, have modest expectations (in terms of degree of participation) for any system you hope to deploy for people to use before, during and after the conference, whether this is a personalised schedule planner, paper annotation system or rating system for local restaurants. People always have massive demands on their time, especially at a conference, so any system that isn&#8217;t already part of a workflow they are engaged with is likely to get limited uptake.</p>
<p>2. Publish Data Early then Incrementally Improve</p>
<p>Perhaps your goal in publishing RDF data about your conference is simply to do the right thing by eating your own dog food and providing an archival record of the event in machine-readable form. This is fine, but ideally you want people to use the published data before and during the event, not just afterwards. In an ideal world, people will use the data you publish as a foundation for demos of their applications and services at the conference, as a means to enhance the event and also to promote their own work. To maximise the chances of this happening you need to make it clear in advance that you will be publishing this data, and give an indication of what the scope of this will be. The RDF available from previous events in the ESWC and ISWC series can give an impression of the shape of the data you will publish (assuming you follow the same modelling patterns), but get samples out early and basic structures in place so people have the chance to prepare. Better to incrementally enhance something than save it all up for a big bang just one week before the conference.</p>
<p>3. Attend to the details</p>
<p>Many of the recent ESWC and ISWC events have done a great job of publishing conference data, and have certainly streamlined the process considerably. However, along the way we&#8217;ve lost (or failed to attend to) some of the small but significant facts that relate to a conference, such as the location, venue, sponsors and keynote speakers. This stuff matters, and is the kind of data that probably doesn&#8217;t get recorded elsewhere. Obviously publishing data about the conference papers is important, but from an archival point of view this information is at least recorded by the publishers of the proceedings. The more tacit, historical knowledge about a conference series may be of great interest in the future, but is at risk of slipping away.</p>
<p>4. Piggy-back on Existing Infrastructure</p>
<p>As I discovered while coordinating the Semantic Web Technologies for ESWC2006, deploying event-specific services is simply making a rod for your own back. Who is going to ensure these stay alive after the event is over and everyone moves onto the next thing? The answer is probably no-one. The domain-registration will lapse, the server will get hacked or develop a fault, the person who once knew why that site mattered will take a job elsewhere, and the data will disappear in the process. Therefore it&#8217;s critical that every event uses infrastructure that is already embedded in everyday usage and also/therefore has a future. The best example of this is data.semanticweb.org, the de facto home for Linked Data from Web-related events. This service has support from SWSA, and enough buy-in from the community, to minimise the risk that it will ever go away. By all means host the data on the conference Web site if you must, but don&#8217;t dream of not mirroring it at data.semanticweb.org, with owl:sameAs links to equivalent URIs in that namespace for all entities in your data set.</p>
<p>5. Put Your Data in the Web</p>
<p>Remember that while putting your data on the Web for others to use is a great start, it&#8217;s going to be of greatest use to people if it&#8217;s also *in* the Web. This is a frequently overlooked distinction, but it really matters. No one in their right mind would dream of having a Web site with no incoming or outgoing links, and the same applies to data. Wherever possible the entities in your data set need to be linked to related entities in other data sets. This could be as simple as linking the conference venue to the town in which it is located, where the URI for the town comes from Geonames. Linking in this way ensures that consumers of the data can discover related information, and avoids you having to publish redundant information that already exists somewhere else on the Web. The really great news is that data.semanticweb.org already provides URIs for many people who have published in the Semantic Web field, and (aside from some complexities with special characters in names) linking to these really can be achieved in one line of code. When it&#8217;s this easy there really are no excuses.</p>
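<p>To give a feel for how little code the linking step needs, here is a small Python sketch that turns an author name into a candidate person URI, including a crude treatment of the special characters mentioned above. The URI pattern (the base namespace followed by a lower-case, hyphenated name) is an assumption for illustration, as are the function names, so check the result against the URIs the service really publishes.</p>

```python
import unicodedata


def person_slug(name):
    """Lower-case, ASCII-fold and hyphenate a personal name."""
    # Split accented letters into base letter plus combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop everything that is not plain ASCII, including the combining marks.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Treat punctuation and whitespace as word separators.
    words = "".join(ch if ch.isalnum() else " " for ch in ascii_only).split()
    return "-".join(word.lower() for word in words)


def person_uri(name, base="http://data.semanticweb.org/person/"):
    """Build a candidate URI; the pattern is assumed, not confirmed."""
    return base + person_slug(name)


print(person_uri("Tom Heath"))
```

<p>Letters that have no ASCII decomposition are simply dropped by this approach, which is one reason the special-character cases deserve a manual check.</p>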
<p><strong>Conclusions</strong></p>
<p>Reading the above points back before I hit publish, I realise they focus on Semantic Web-enabling the conference as a whole, rather than specifically the conference Web site, which was the focus of Chris&#8217;s original question. I think we know a decent amount about <a rel="nofollow" title="publishing Linked Data on the Web" target="_blank" href="http://linkeddata.org/docs/how-to-publish">publishing Linked Data on the Web</a>, so hopefully these tips usefully address the more process-oriented than technical aspects.</p>
]]></content:encoded>
      </item>
      <item>
         <title>Real Travel Chooses Exalead CloudView Search - Market Wire (press release)</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/7u1I9mJA3kw/industry-news-real-travel-chooses-exalead-cloudview-search-market-wire-press-release.html</link>
         <description>&lt;table border=&quot;0&quot; cellpadding=&quot;2&quot; cellspacing=&quot;7&quot;&gt;&lt;tr&gt;&lt;td width=&quot;80&quot; align=&quot;center&quot; valign=&quot;top&quot;&gt;&lt;font&gt;&lt;/font&gt;&lt;/td&gt;
&lt;td valign=&quot;top&quot; class=&quot;j&quot;&gt;&lt;font&gt;
&lt;/font&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;div class=&quot;feedflare&quot;&gt;
&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=7u1I9mJA3kw:LoNPEVceFu8:yIl2AUoC8zA&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?d=yIl2AUoC8zA&quot; border=&quot;0&quot;&gt;&lt;/a&gt; &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=7u1I9mJA3kw:LoNPEVceFu8:V_sGLiPBpWU&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?i=7u1I9mJA3kw:LoNPEVceFu8:V_sGLiPBpWU&quot; border=&quot;0&quot;&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src=&quot;http://feeds.feedburner.com/~r/SemanticUniverse/~4/7u1I9mJA3kw&quot; height=&quot;1&quot; width=&quot;1&quot;/&gt;</description>
         <guid isPermaLink="false">4718 at http://www.semanticuniverse.com</guid>
         <pubDate>Thu, 19 Nov 2009 04:11:17 -0800</pubDate>
      </item>
      <item>
         <title>Computers Can't Answer Everything - MIT Technology Review (blog)</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/DwNKwcfRJHQ/industry-news-computers-cant-answer-everything-mit-technology-review-blog.html</link>
         <description>&lt;table border=&quot;0&quot; cellpadding=&quot;2&quot; cellspacing=&quot;7&quot;&gt;&lt;tr&gt;&lt;td width=&quot;80&quot; align=&quot;center&quot; valign=&quot;top&quot;&gt;&lt;font&gt;&lt;/font&gt;&lt;/td&gt;
&lt;td valign=&quot;top&quot; class=&quot;j&quot;&gt;&lt;font&gt;
&lt;/font&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;div class=&quot;feedflare&quot;&gt;
&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=DwNKwcfRJHQ:N2DRsKpnmYY:yIl2AUoC8zA&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?d=yIl2AUoC8zA&quot; border=&quot;0&quot;&gt;&lt;/a&gt; &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=DwNKwcfRJHQ:N2DRsKpnmYY:V_sGLiPBpWU&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?i=DwNKwcfRJHQ:N2DRsKpnmYY:V_sGLiPBpWU&quot; border=&quot;0&quot;&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src=&quot;http://feeds.feedburner.com/~r/SemanticUniverse/~4/DwNKwcfRJHQ&quot; height=&quot;1&quot; width=&quot;1&quot;/&gt;</description>
         <guid isPermaLink="false">4721 at http://www.semanticuniverse.com</guid>
         <pubDate>Wed, 18 Nov 2009 21:06:48 -0800</pubDate>
      </item>
      <item>
         <title>Microsoft live labs introduces Pivot visual search - Neowin</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/ZCcynnE3IW4/industry-news-microsoft-live-labs-introduces-pivot-visual-search-neowin.html</link>
         <description>&lt;table border=&quot;0&quot; cellpadding=&quot;2&quot; cellspacing=&quot;7&quot;&gt;&lt;tr&gt;&lt;td width=&quot;80&quot; align=&quot;center&quot; valign=&quot;top&quot;&gt;&lt;font&gt;&lt;/font&gt;&lt;/td&gt;
&lt;td valign=&quot;top&quot; class=&quot;j&quot;&gt;&lt;font&gt;
&lt;/font&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;div class=&quot;feedflare&quot;&gt;
&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=ZCcynnE3IW4:z1XJNi_aftc:yIl2AUoC8zA&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?d=yIl2AUoC8zA&quot; border=&quot;0&quot;&gt;&lt;/a&gt; &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=ZCcynnE3IW4:z1XJNi_aftc:V_sGLiPBpWU&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?i=ZCcynnE3IW4:z1XJNi_aftc:V_sGLiPBpWU&quot; border=&quot;0&quot;&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src=&quot;http://feeds.feedburner.com/~r/SemanticUniverse/~4/ZCcynnE3IW4&quot; height=&quot;1&quot; width=&quot;1&quot;/&gt;</description>
         <guid isPermaLink="false">4725 at http://www.semanticuniverse.com</guid>
         <pubDate>Wed, 18 Nov 2009 15:09:58 -0800</pubDate>
      </item>
      <item>
         <title>Detained in Heathrow</title>
         <link>http://blogs.sun.com/bblfish/entry/detained_in_heathrow</link>
         <description>&lt;p&gt;Sipping a coffee in Heathrow, after having - finally - picked up my computer and bicycle that just arrived back from the US, following &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://blogs.sun.com/bblfish/entry/7_days_in_sf_jail&quot;&gt;my recent adventure in San Francisco&lt;/a&gt;. Thanks to a very friendly Ernesto Smith from British Airways, who very kindly dealt with the paper work at the police lost and found at SFO, and forwarded my belongings to London.&lt;/p&gt;
&lt;p&gt;As I was catching up on my last 2 weeks of e-mail &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://mmt.me.uk/&quot;&gt;Mischa Tuffield&lt;/a&gt; kindly sent me a few links to the following PHD Comics cartoon. :-)&lt;/p&gt;
&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.phdcomics.com/comics/archive.php?comicid=1243&quot;&gt;&lt;img title=&quot;phd comics detained&quot; src=&quot;http://www.phdcomics.com/comics/archive/phd102609s.gif&quot;&gt;&lt;/a&gt;
&lt;p&gt;Click on the image for the following episodes.&lt;/p&gt;
&lt;p&gt;He had it easy. In the UK, they even let him go out to seek a hotel! Perhaps what I need is a Phd...&lt;/p&gt;</description>
         <guid isPermaLink="false">http://blogs.sun.com/bblfish/entry/detained_in_heathrow</guid>
         <pubDate>Wed, 18 Nov 2009 09:49:44 -0800</pubDate>
      </item>
      <item>
         <title>DataIncubator: What Is It and What's In It?</title>
         <link>http://feedproxy.google.com/~r/Nodalities/~3/dETYvrPWU2Q/dataincubator.php</link>
         <description>by Leigh Dodds
&amp;#124; this article first appeared in Nodalities Magazine, issue 8
The Linking Open Data project has had a huge amount of success in bootstrapping the burgeoning Linked Data cloud. There&amp;#8217;s now a definite sense of momentum behind the project, and a growing number of organisations are now seriously investigating how their data could further [...]</description>
         <guid isPermaLink="false">JsAx1rN13BGzogf56UjTQA_858345978aa6bafc1e8228786aae7ded</guid>
         <pubDate>Wed, 18 Nov 2009 00:05:16 -0800</pubDate>
         <content:encoded/>
      </item>
      <item>
         <title>Put in your postcode, out comes the data - Times Online</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/NOhKhu5_xpI/industry-news-put-your-postcode-out-comes-data-times-online.html</link>
         <description>&lt;table border=&quot;0&quot; cellpadding=&quot;2&quot; cellspacing=&quot;7&quot;&gt;&lt;tr&gt;&lt;td width=&quot;80&quot; align=&quot;center&quot; valign=&quot;top&quot;&gt;&lt;font&gt;&lt;/font&gt;&lt;/td&gt;
&lt;td valign=&quot;top&quot; class=&quot;j&quot;&gt;&lt;font&gt;
&lt;/font&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;div class=&quot;feedflare&quot;&gt;
&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=NOhKhu5_xpI:f96PzxQSsj8:yIl2AUoC8zA&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?d=yIl2AUoC8zA&quot; border=&quot;0&quot;&gt;&lt;/a&gt; &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=NOhKhu5_xpI:f96PzxQSsj8:V_sGLiPBpWU&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?i=NOhKhu5_xpI:f96PzxQSsj8:V_sGLiPBpWU&quot; border=&quot;0&quot;&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src=&quot;http://feeds.feedburner.com/~r/SemanticUniverse/~4/NOhKhu5_xpI&quot; height=&quot;1&quot; width=&quot;1&quot;/&gt;</description>
         <guid isPermaLink="false">4714 at http://www.semanticuniverse.com</guid>
         <pubDate>Tue, 17 Nov 2009 13:09:37 -0800</pubDate>
      </item>
      <item>
         <title>data.gov.uk and the Talis Platform</title>
         <link>http://feedproxy.google.com/~r/Nodalities/~3/Ii7p9mu59rA/data-gov-uk-and-the-talis-platform.php</link>
         <description>Earlier this year Gordon Brown appointed Tim Berners-Lee as an advisor to the Cabinet Office to help the government begin the process of opening up its data. This was one part of the initiation of a project to begin opening up UK government data in a similar style to the US. A key part of [...]</description>
         <guid isPermaLink="false">JsAx1rN13BGzogf56UjTQA_8d4a6866c3d041a7072f02f71e6e3038</guid>
         <pubDate>Tue, 17 Nov 2009 02:19:15 -0800</pubDate>
         <content:encoded/>
      </item>
      <item>
         <title>Worio next generation of tailored search engines - Vancouver Sun</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/8bQ2FM1dM7g/industry-news-worio-next-generation-tailored-search-engines-vancouver-sun.html</link>
         <description>&lt;table border=&quot;0&quot; cellpadding=&quot;2&quot; cellspacing=&quot;7&quot;&gt;&lt;tr&gt;&lt;td width=&quot;80&quot; align=&quot;center&quot; valign=&quot;top&quot;&gt;&lt;font&gt;&lt;/font&gt;&lt;/td&gt;
&lt;td valign=&quot;top&quot; class=&quot;j&quot;&gt;&lt;font&gt;
&lt;/font&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;div class=&quot;feedflare&quot;&gt;
&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=8bQ2FM1dM7g:2hzLHCN7rV0:yIl2AUoC8zA&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?d=yIl2AUoC8zA&quot; border=&quot;0&quot;&gt;&lt;/a&gt; &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?a=8bQ2FM1dM7g:2hzLHCN7rV0:V_sGLiPBpWU&quot;&gt;&lt;img src=&quot;http://feeds.feedburner.com/~ff/SemanticUniverse?i=8bQ2FM1dM7g:2hzLHCN7rV0:V_sGLiPBpWU&quot; border=&quot;0&quot;&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src=&quot;http://feeds.feedburner.com/~r/SemanticUniverse/~4/8bQ2FM1dM7g&quot; height=&quot;1&quot; width=&quot;1&quot;/&gt;</description>
         <guid isPermaLink="false">4704 at http://www.semanticuniverse.com</guid>
         <pubDate>Tue, 17 Nov 2009 01:09:06 -0800</pubDate>
      </item>
      <item>
         <title>When Linked Data Rules Fail</title>
         <link>http://feedproxy.google.com/~r/FredOnSomething/~3/SUH3v-KgZ0Y/</link>
         <description>High Visibility Problems with NYT, data.gov Show Need for Better
Practices When I say, &quot;shot&quot;, what do you think of? A flu shot? A shot of whisky? A moon shot? A gun shot? What if I add the term &quot;bank&quot;? Do you now think of someone being shot in an armed robbery ...</description>
         <guid isPermaLink="false">JsAx1rN13BGzogf56UjTQA_e30c95231bb9e110c739d27273e5a3ed</guid>
         <pubDate>Mon, 16 Nov 2009 09:03:11 -0800</pubDate>
         <content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=When Linked Data Rules Fail&amp;rft.aulast=Giasson&amp;rft.aufirst=Fr&#xe9;d&#xe9;rick&amp;rft.subject=Semantic Web&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-11-16&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/11/16/when-linked-data-rules-fail/&amp;rft.language=English"></span>
<p><a rel="nofollow" target="_blank" href="http://www.adhd-mindbydesign.com"><img style="border:0px solid;width:220px;height:223px;float:left;margin-right:10px;" title="Image Source: www.adhd-mindbydesign.com" src="http://fgiasson.com/blog/wp-content/uploads/2009/11/091115_disconnected.jpg" alt="Image Source: www.adhd-mindbydesign.com" hspace="5" vspace="5" align="left"/></a></p>
<h2>High Visibility Problems with NYT, data.gov Show Need for Better<br />
Practices</h2>
<p>When I say, &#8220;shot&#8221;, what do you think of? A flu shot? A shot of whisky? A moon shot? A gun shot? What if I add the term &#8220;bank&#8221;? Do you now think of someone being shot in an armed robbery of a local bank or similar?</p>
<p>And, now, what if I add a reference to say, <a rel="nofollow" style="font-style:italic;" target="_blank" href="http://en.wikipedia.org/wiki/The_Hustler_%28film%29">The Hustler</a>, or Minnesota Fats, or &#8220;Fast Eddie&#8221; Felson? Do you now see the connection to a pressure-packed banked pool shot in some smoky bar room?</p>
<p>As humans we need context to make connections and remove ambiguity. For machines, with their limited reasoning and inference engines, context and accurate connections are even more important.</p>
<p>Over the past few weeks we have seen announcements of two large and high-visibility <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Linked_data">linked data</a> projects: one, a first release of references for articles concerning about 5,000 people from the New York Times at <a rel="nofollow" class="http" target="_blank" href="http://data.nytimes.com/">data.nytimes.com</a>; and two, a massive exposure of 5 billion triples from <a rel="nofollow" target="_blank" href="http://tw.rpi.edu/">data.gov</a> datasets provided by the <a rel="nofollow" target="_blank" href="http://tw.rpi.edu/">Tetherless World Constellation</a> (TWC) at <a rel="nofollow" target="_blank" href="http://rpi.edu/">Rensselaer Polytechnic Institute</a> (RPI).</p>
<p>On various grounds from <a rel="nofollow" target="_blank" href="http://go-to-hellman.blogspot.com/2009/10/new-york-times-blunders-into-linked.html">licensing</a> to <a rel="nofollow" target="_blank" href="http://dowhatimean.net/2009/10/linked-data-at-the-new-york-times-exciting-but-buggy">data characterization</a> and to creating linked data for its <a rel="nofollow" target="_blank" href="http://www.betaversion.org/%7Estefano/linotype/news/351/">own sake</a>, some prominent commentators have weighed in on what is good and what is not so good with these datasets. One of us, Mike, <a rel="nofollow" target="_blank" href="http://www.mkbergman.com/843/must-read-data-smoke-and-mirrors/">commented</a> about a week ago that &#8220;we have now moved beyond &#8216;proof of concept&#8217; to the need for actual useful data of trustworthy provenance and proper mapping and characterization. Recent efforts are a disappointment that no enterprise would or could rely upon.&#8221;</p>
<p>Reactions to <a rel="nofollow" target="_blank" href="http://www.mkbergman.com/843/must-read-data-smoke-and-mirrors/">that posting</a> and continued discussion on various <a rel="nofollow" target="_blank" href="http://lists.w3.org/Archives/Public/public-esw-thes/2009Nov/0000.html">mailing lists</a> warrant a more precise dissection of what is wrong and still needs to be done with these datasets <a rel="nofollow" href="#ld1">[1]</a>.</p>
<h3>Berners-Lee&#8217;s Four Linked Data &#8220;Rules&#8221;</h3>
<p> It is useful, then, to return to first principles, namely the original four &#8220;rules&#8221; posed by Tim Berners-Lee in his design note on linked data <a rel="nofollow" href="#ld2">[2]</a>:</p>
<ol>
<li>Use URIs as names for things</li>
<li>Use HTTP URIs so that people can look up those names</li>
<li>When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)</li>
<li>Include links to other URIs so that they can discover more things.</li>
</ol>
<p>The first two rules are definitional to the idea of linked data. They cement the basis of linked data in the Web, and are not at issue with either of the two linked data projects that are the subject of this posting.</p>
<p>However, it is the lack of specifics and guidance in the last two rules where the breakdowns occur. Both the NYT and the RPI datasets suffer from a lack of &#8220;providing useful information&#8221; (Rule #3). And, the <span class="double_u">nature</span> of the links in Rule #4 is a real problem for the NYT dataset.</p>
<h3>What Constitutes &#8220;Useful Information&#8221;?</h3>
<p>The Wikipedia entry on <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Linked_data">linked data</a> expands on &#8220;useful information&#8221; by augmenting the original rule with the parenthetical clause, &#8220;(<span style="font-style:italic;">i.e.</span>, a structured description &#8211; metadata).&#8221; But even that expansion is insufficient.</p>
<p>Fundamentally, what are we talking about with linked data? Well, we are talking about instances that are characterized by one or more attributes. Those instances exist within contexts of various natures. And, those contexts may relate to other existing contexts.</p>
<p>We can break this problem description down into three parts:</p>
<ul>
<li>A <span style="font-weight:bold;font-style:italic;">vocabulary</span> that defines the nature of the instances and their descriptive attributes</li>
<li>A <span style="font-weight:bold;font-style:italic;">schema</span> of some nature that describes the structural relationships amongst instances and their characteristics, and, optimally,</li>
<li>A <span style="font-weight:bold;font-style:italic;">mapping</span> to existing external schema or constructs that help place the data into context.</li>
</ul>
<p>At minimum, <span class="double_u">ANY</span> dataset exposed as linked data needs to be described by a <span style="font-weight:bold;font-style:italic;">vocabulary</span>. Both the NYT and RPI datasets fail on this score, as we elaborate below. Better practice is to also provide a <span style="font-weight:bold;font-style:italic;">schema</span> of relationships in which to embed each instance record. And, best practice is to also <span style="font-weight:bold;font-style:italic;">map</span> those structures to external schema.</p>
<p>Lacking this &#8220;useful information&#8221;, especially a defining vocabulary, we cannot begin to understand whether our instances deal with drinks, bank robberies or pool shots. This lack, in essence, makes the information worthless, even though available via URL.</p>
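<p>The minimum requirement above can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: triples are plain tuples, every URI and value in the sample data is hypothetical, and a property counts as &#8220;defined&#8221; only if the vocabulary gives it an rdfs:comment. That last criterion is our simplification, not a rule from any specification.</p>

```python
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def undefined_properties(data_triples, vocabulary_triples):
    """Return the properties used in the data that no vocabulary triple defines."""
    used = {predicate for (_, predicate, _) in data_triples}
    defined = {
        subject
        for (subject, predicate, _) in vocabulary_triples
        if predicate == RDFS_COMMENT
    }
    return sorted(used - defined)


# Hypothetical URIs and values, for illustration only.
VOCAB = "http://example.org/vocab/"
data = [
    ("http://example.org/record/1", VOCAB + "item_code", "FHC2"),
    ("http://example.org/record/1", VOCAB + "seasonal", "U"),
]
vocabulary = [
    # A label alone repeats the name; only a comment explains the meaning.
    (VOCAB + "item_code", RDFS_LABEL, "item code"),
    (VOCAB + "seasonal", RDFS_LABEL, "seasonal"),
    (VOCAB + "seasonal", RDFS_COMMENT, "Whether the value is seasonally adjusted."),
]
missing = undefined_properties(data, vocabulary)
print(missing)
```

<p>A dataset for which this list is non-empty has attributes that carry a label but no stated meaning.</p>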
<h4>The data.gov (RPI) Case</h4>
<p>With the support of NSF and various grant funding, RPI has set up the <a rel="nofollow" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/The_Data-gov_Wiki">Data-Gov Wiki</a> <a rel="nofollow" href="#ld3">[3]</a>, which is in the process of converting the datasets on <a rel="nofollow">data.gov</a> to RDF, placing them into a semantic wiki to enable comment and annotation, and providing that data as RSS feeds. Other demos are also being placed on the site.</p>
<p>As of the date of this posting, the site had a <a rel="nofollow" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/Data.gov_Catalog">catalog</a> of 116 datasets from the 800 or so available on data.gov, leading to these statistics:</p>
<ul>
<li>459,412,419 table entries</li>
<li>5,074,932,510 triples, and</li>
<li>7,564 properties (or attributes).</li>
</ul>
<p>We&#8217;ll take one of these datasets, <a rel="nofollow" target="_blank" href="http://www.data.gov/details/319">#319</a>, and look a bit closer at it:</p>
<table border="1" cellspacing="0" cellpadding="4">
<tbody>
<tr>
<th style="background-color:#cccccc;">Wiki</th>
<th style="background-color:#cccccc;"> Title</th>
<th style="background-color:#cccccc;"> Agency</th>
<th style="background-color:#cccccc;"> Name</th>
<th style="background-color:#cccccc;"> data.gov Link</th>
<th style="background-color:#cccccc;"> No Properties</th>
<th style="background-color:#cccccc;"> No Triples</th>
<th style="background-color:#cccccc;">RDF File</th>
</tr>
<tr>
<td><a rel="nofollow" title="Dataset 319" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/Dataset_319">Dataset 319</a></td>
<td>Consumer Expenditure Survey</td>
<td><a rel="nofollow" title="Department of Labor" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/Department_of_Labor">Department of Labor</a></td>
<td><a rel="nofollow" title="LABOR-STAT (page does not exist)" target="_blank" href="http://data-gov.tw.rpi.edu/w/index.php?title=LABOR-STAT&amp;action=edit&amp;redlink=1">LABOR-STAT</a></td>
<td><a rel="nofollow" title="http://www.data.gov/details/319" target="_blank" href="http://www.data.gov/details/319">http://www.data.gov/details/319</a></td>
<td style="text-align:right;">22</td>
<td style="text-align:right;">1,583,236</td>
<td><a rel="nofollow" title="http://data-gov.tw.rpi.edu/raw/319/index.rdf" target="_blank" href="http://data-gov.tw.rpi.edu/raw/319/index.rdf">http://data-gov.tw.rpi.edu/raw/319/index.rdf</a></td>
</tr>
</tbody>
</table>
<p>This report was picked solely because it had a small number of attributes (properties), and is thus easier to screen capture. The summary report on the wiki is shown by this <a rel="nofollow" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/Dataset_319">page</a>:</p>
<div style="margin:10px;">
<p><a rel="nofollow" target="_blank" href="http://fgiasson.com/blog/wp-content/uploads/2009/11/091115_wiki_dataset_319.png"><br />
<img class="center" style="border:0px solid;width:600px;height:611px;" title="Click to expand" src="http://fgiasson.com/blog/wp-content/uploads/2009/11/091115_wiki_dataset_319.png" alt="Data-gov-Wiki Dataset #319"/></a></p>
<p><span style="font-style:italic;font-size:90%;">(click to expand)</span></div>
<p>So, we see that this specific dataset contains about 22 of the nearly 8,000 attributes across all datasets.</p>
<p>When we click on one of these attribute names, we are then taken to a specific wiki page that only reiterates its label. There is no definition or explanation.</p>
<p>Inspecting this page further, we see that, other than the broad characterization of the dataset itself (the bulk of the page), the bottom of the page lists 22 undefined attributes with labels such as <span style="font-style:italic;">item code</span>, <span style="font-style:italic;">periodicity code</span>, <span style="font-style:italic;">seasonal</span>, and the like. These attributes are the real structural basis for the data in this dataset.</p>
<p>But, what does all of this mean???</p>
<p>To gain a clue, now let&#8217;s go to the source data.gov site for this <a rel="nofollow" target="_blank" href="http://www.data.gov/details/319">dataset (#319)</a>. Here is how that report looks:</p>
<div style="margin:10px;">
<p><a rel="nofollow" target="_blank" href="http://fgiasson.com/blog/wp-content/uploads/2009/11/091115_data_gov_319.png"><br />
<img class="center" style="border:0px solid;width:600px;height:1146px;" title="Click to expand" src="http://fgiasson.com/blog/wp-content/uploads/2009/11/091115_data_gov_319.png" alt="Data.gov Dataset #319"/></a></p>
<p><span style="font-style:italic;font-size:90%;">(click to expand)</span></div>
<p> Contained within this report we see a listing for additional <a rel="nofollow" target="_blank" href="ftp://ftp.bls.gov/pub/time.series/cx/cx.txt">metadata</a>. This link tells us about the various data fields contained in this dataset; we see many of these attributes are &#8220;codes&#8221; to various data categories.</p>
<p>Probing further into the dataset&#8217;s <a rel="nofollow" target="_blank" href="http://www.bls.gov/cex/">technical documentation</a>, we see that there is indeed a rich structure underneath this report, again provided via various code lookups. There are codes for geography, seasonality (adjusted or not), consumer demographic profiles and a variety of consumption categories. (See, for example, the link to this <a rel="nofollow" target="_blank" href="http://www.bls.gov/cex/csxgloss.htm">glossary page</a>.) These are the keys to understanding the actual values within this dataset.</p>
<p>For example, one major dimension of the data is captured by the attribute <span style="font-style:italic;">item_code</span>. The survey breaks down consumption expenditures within the broad categories of Food, Housing, Apparel and Services, Transportation, Health Care, Entertainment, and Other. Within a category, there is also a rich structural breakdown. For example, expenditures for Bakery Products within Food are given a <a rel="nofollow" target="_blank" href="ftp://ftp.bls.gov/pub/time.series/cx/cx.item">code</a> of FHC2.</p>
<p>But, nowhere are these codes defined or unlocked in the RDF datasets. This absence is true for virtually all of the datasets exposed on this wiki.</p>
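<p>To make concrete what &#8220;unlocking&#8221; a code could look like, here is a minimal Python sketch that turns a code table into descriptive triples. The only pairing taken from the documentation discussed above is FHC2 for Bakery Products within Food; the namespaces, the function name and the choice of skos:broader to express &#8220;within&#8221; are assumptions for illustration, and a real table would be loaded from the agency&#8217;s code files.</p>

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"

# Hypothetical namespaces, for illustration only.
ITEM_BASE = "http://example.org/cx/item/"
CATEGORY_BASE = "http://example.org/cx/category/"

# code: (label, broad category); FHC2 is the one example given in the text.
ITEM_CODES = {"FHC2": ("Bakery Products", "Food")}


def code_triples(item_codes):
    """Describe each item code with a label and a link to its broad category."""
    triples = []
    for code in sorted(item_codes):
        label, category = item_codes[code]
        item_uri = ITEM_BASE + code
        category_uri = CATEGORY_BASE + category.replace(" ", "_")
        triples.append((item_uri, RDFS_LABEL, label))
        triples.append((item_uri, SKOS_BROADER, category_uri))
    return triples


for triple in code_triples(ITEM_CODES):
    print(triple)
```

<p>With triples like these published alongside the data, an <span style="font-style:italic;">item_code</span> value can point at a described resource instead of remaining an opaque string.</p>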
<p>So, for literally billions of triples, and 8,000 attributes, we have <span style="font-weight:bold;">ABSOLUTELY NO INFORMATION ABOUT WHAT THE DATA CONTAINS OTHER THAN A PROPERTY LABEL</span>. There is much, much rich value here in data.gov, but all of it remains locked up and hidden.</p>
<p>The sad truth about this data release is that it provides absolutely no value in its current form. We lack the keys to unlock the value.</p>
<p>To be sure, early essential spade work has been done here to begin putting in place the conversion infrastructure for moving text files, spreadsheets and the like to an RDF form. This is yeoman work important to ultimate access. But, until a <span style="font-weight:bold;font-style:italic;">vocabulary</span> is published that defines the attributes and their codes so we can unlock this value, it will remain hidden. And until a <span style="font-weight:bold;font-style:italic;">schema</span> of some nature is also published that connects attributes and relations across datasets, the real value from connecting the dots will remain hidden as well.<img style="width:160px;height:218px;float:right;margin-left:10px;" title="The Hustler" src="http://fgiasson.com/blog/wp-content/uploads/2009/11/091115_the_hustler.jpg" alt="The Hustler" align="right"/></p>
<p>These datasets may meet the partial conditions of providing clickable URLs, but the crucial &#8220;useful information&#8221; as to what any of this data means is absent.</p>
<p>Every single dataset on data.gov has supporting references to text files, PDFs, Web pages or the like that describe the nature of the data within each dataset. Until that information is exposed and made usable, we have no linked data. </p>
<p>Until ontologies get created from these technical documents, the value of these data instances remains locked up, and no value can be created from having these datasets expressed in RDF.</p>
<p>The devil lies in the details. The essential hard work has not yet begun.</p>
<h4>The NYT Case</h4>
<p>Though at a much smaller scale with many fewer attributes, the <a rel="nofollow" target="_blank" href="http://data.nytimes.com">NYT dataset</a> suffers from the same failing: it too lacks a <span style="font-weight:bold;font-style:italic;">vocabulary</span>.</p>
<p>So, let&#8217;s take the case of one of the lead actors in <a rel="nofollow" style="font-style:italic;" target="_blank" href="http://en.wikipedia.org/wiki/The_Hustler_%28film%29">The Hustler</a>, Paul Newman, who played the role of &#8220;Fast Eddie&#8221; Felson. Here is the <a rel="nofollow" target="_blank" href="http://data.nytimes.com/N31738445835662083893.html">NYT record</a> for the &#8220;person&#8221; <span style="font-style:italic;">Paul Newman</span> (which they also refer to as <a rel="nofollow" target="_blank" href="http://data.nytimes.com/newman_paul_per">http://data.nytimes.com/newman_paul_per</a>). Note the header title of <span style="font-weight:bold;">Newman, Paul</span>:</p>
<div style="margin:10px;">
<p><a rel="nofollow" target="_blank" href="http://fgiasson.com/blog/wp-content/uploads/2009/11/091115_nyt_paul_newman.png"><br />
<img class="center" style="border:0px solid;width:600px;height:593px;" title="Click to expand" src="http://fgiasson.com/blog/wp-content/uploads/2009/11/091115_nyt_paul_newman.png" alt="NYT 'Paul Newman Articles' Record"/></a></p>
<p><span style="font-style:italic;font-size:90%;">(click to expand)</span></div>
<p> Click on any of the internal labels used by the NYT for its own attributes (such as <a rel="nofollow">nyt:first_use</a>), and you will be given this message:</p>
<div style="margin-left:40px;">
<p><span style="font-style:italic;">&#8220;An RDFS description and English language documentation for the NYT namespace will be provided soon. Thanks for your patience.&#8221;</span></div>
<p>We again have no idea what is meant by all of this data except for the labels used for its attributes. In this case for <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/first_use">nyt:first_use</a> we have a value of &#8220;2001-03-18&#8221;.</p>
<p>Hello? What? What is a &#8220;first use&#8221; for a &#8220;Paul Newman&#8221; of &#8220;2001-03-18&#8221;???</p>
<p>The NYT put the cart before the horse: even if minimal, they should have released their ontology first &#8211; or at least at the same time &#8211; as they released their data instances. (See further <a rel="nofollow" target="_blank" href="http://www.mkbergman.com/825/fresh-perspectives-on-the-semantic-enterprise/">this discussion</a> about how an ontology creation workflow can be incremental by starting simple and then upgrading as needed.)</p>
<h3>Links to Other Things</h3>
<p>Since there really are no links to other things on the Data-Gov Wiki, our focus in this section continues with the NYT dataset using our same example.</p>
<p>We now are in the territory of the fourth &#8220;rule&#8221; of linked data: <span style="font-style:italic;">4. Include links to other URIs so that they can discover more things</span>.</p>
<p>This will seem a bit basic at first, but before we can talk about linking to other things, we first need to understand and define the starting &#8220;thing&#8221; to which we are linking.</p>
<h4>What is a &#8220;Newman, Paul&#8221; Thing?</h4>
<p>Of course, without its own vocabulary, we are left to deduce what this thing &#8220;<span style="font-weight:bold;">Newman, Paul</span>&#8221; shown in the previous screen shot actually <span class="double_u">is</span>. Our first clue comes from the statement that it is of <span style="font-style:italic;">rdf:type</span> <a rel="nofollow" target="_blank" href="http://www.w3.org/TR/skos-reference/">SKOS</a> <span style="font-style:italic;">concept</span>. By looking to the SKOS vocabulary, we see that <a rel="nofollow" target="_blank" href="http://www.w3.org/TR/skos-reference/#concepts"><span style="font-style:italic;">concept</span></a> is a class and is defined as:</p>
<p style="margin-left:40px;font-style:italic;">A SKOS concept can be viewed as an idea or notion; a unit of thought. However, what constitutes a unit of thought is subjective, and this definition is meant to be suggestive, rather than restrictive. The notion of a SKOS concept is useful when describing the conceptual or intellectual structure of a knowledge organization system, and when referring to specific ideas or meanings established within a KOS.</p>
<p>We also see that this instance is given a <a rel="nofollow" target="_blank" href="http://xmlns.com/foaf/0.1/primaryTopic">foaf:primaryTopic</a> of <span style="font-style:italic;">Paul Newman</span>.</p>
<p>So, we can deduce so far that this instance is about the concept or idea of <span style="font-style:italic;">Paul Newman</span>. Now, looking to the attributes of this instance (that is, the defining properties provided by the NYT), we see the properties of <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/associated_article_count">nyt:associated_article_count</a>, <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/first_use">nyt:first_use</a>, <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/latest_use">nyt:latest_use</a> and <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/topicPage">nyt:topicPage</a>. Completing our deductions, and in the absence of its own vocabulary, we can now define this concept instance somewhat as follows:</p>
<p style="margin-left:40px;"><span style="font-style:italic;">New York Times articles in the period 2001 to 2009 having as their primary topic the actor Paul Newman</span></p>
<p>(BTW, across all records in this dataset, we could see what the earliest first use was to better deduce the time period over which these articles have been assembled, but that has not been done.)</p>
<p>We also would re-title this instance more akin to &#8220;2001-2009 NYT Articles with a Primary Topic of Paul Newman&#8221; or some such and use URIs more akin to this usage. </p>
<h4>sameAs Woes</h4>
<p>Thus, in order to make links or connections with other data, it is essential to understand what the nature is of the subject &#8220;thing&#8221; at hand. There is much confusion about actual &#8220;things&#8221; and the references to &#8220;things&#8221; and what is the nature of a &#8220;thing&#8221; within the literature and on mailing lists.</p>
<p>Our belief and usage in matters of the semantic Web is that all &#8220;things&#8221; we deal with are a reference to whatever the &#8220;true&#8221;, actual thing is. The question then becomes: What is the nature (or scope) of this referent?</p>
<p>There are actually quite easy ways to determine this nature. First, look to one or more instance examples of the &#8220;thing&#8221; being referred to. In our case above, we have the &#8220;<span style="font-weight:bold;">Newman, Paul</span>&#8221; instance record. Then, look to the properties (or attributes) the publisher of that record has used to describe that thing. Again, in the case above, we have <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/associated_article_count">nyt:associated_article_count</a>, <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/first_use">nyt:first_use</a>, <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/latest_use">nyt:latest_use</a> and <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/topicPage">nyt:topicPage</a>.</p>
<p>Clearly, this instance record (that is, its nature) deals with articles or groups of articles. The relation to <span style="font-style:italic;">Paul Newman</span> arises because he is the <span class="double_u">primary topic</span> of these articles, not because the instance describes a <span class="double_u">person</span>. If the nature of the instance were indeed the person <span style="font-style:italic;">Paul Newman</span>, then the attributes of the record would more properly be related to &#8220;person&#8221; properties such as age, sex, birthdate, death date, marital status, etc.</p>
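<p>The two-step check described above (look at an instance, then look at the properties used to describe it) can be sketched in a few lines of Python. This is only an illustration: the two hint sets, the <span style="font-style:italic;">ex:</span> property names and the <span style="font-style:italic;">guess_nature</span> helper are our own assumptions, not part of any published vocabulary or tool.</p>

```python
# Hypothetical hint sets for illustration; neither comes from a published vocabulary mapping.
PERSON_HINTS = {"foaf:birthday", "foaf:gender", "ex:deathDate", "ex:maritalStatus"}
ARTICLE_SET_HINTS = {
    "nyt:associated_article_count",
    "nyt:first_use",
    "nyt:latest_use",
    "nyt:topicPage",
}


def guess_nature(properties):
    """Return the label whose hint set overlaps most with the record's properties."""
    props = set(properties)
    scores = {
        "person": len(props.intersection(PERSON_HINTS)),
        "article collection": len(props.intersection(ARTICLE_SET_HINTS)),
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return "unknown"
    return best


# Properties seen on the "Newman, Paul" record discussed above.
newman_record = [
    "rdf:type",
    "foaf:primaryTopic",
    "nyt:associated_article_count",
    "nyt:first_use",
    "nyt:latest_use",
    "nyt:topicPage",
]

print(guess_nature(newman_record))
```

<p>With these assumed hint sets, the record's properties all point to a collection of articles rather than to a person, which is the same conclusion we reach by reading the record.</p>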
<p>This confusion by NYT as to the nature of the &#8220;things&#8221; they are describing then leads to some very serious errors. By confusing the topic (<span style="font-style:italic;">Paul Newman</span>) of a record with the nature of that record (articles about topics), NYT next misuses one of the most powerful semantic Web predicates available, <span style="font-weight:bold;">owl:sameAs</span>.</p>
<p>By asserting in the &#8220;<span style="font-weight:bold;">Newman, Paul</span>&#8221; record that the instance has a <span style="font-weight:bold;">sameAs</span> relationship with external records in <a rel="nofollow" target="_blank" href="http://rdf.freebase.com/ns/en.paul_newman">Freebase</a> and <a rel="nofollow" target="_blank" href="http://dbpedia.org/resource/Paul_Newman">DBpedia</a>, the NYT both <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Entailment">entail</a>s that properties from any of the associated records are shared and <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Inference">infers</a> a chain of other types to describe the record. More precisely, the NYT is asserting that the &#8220;things&#8221; referred to by these instances are <strong class="moz-txt-star">identical</strong> resources.</p>
<p>Thus, by the <span style="font-weight:bold;">sameAs</span> statements in the <span style="font-weight:bold;">Newman, Paul</span> record, the NYT is also asserting that this record is an instance of all these classes:</p>
<table border="0">
<tbody>
<tr>
<td></td>
<td>
<ul>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/about/html/http://www.w3.org/2002/07/owl%23Thing">owl:Thing</a></li>
<li> <a rel="nofollow" target="_blank" href="http://xmlns.com/foaf/spec/#term_Agent">foaf:Agent</a></li>
<li> <a rel="nofollow" target="_blank" href="http://xmlns.com/foaf/spec/#term_Person">foaf:Person</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/ontology/Actor">dbpedia-owl:Actor</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/JewishActors">http://dbpedia.org/class/yago/JewishActors</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/PeopleFromCleveland,Ohio">http://dbpedia.org/class/yago/PeopleFromCleveland,Ohio</a></li>
<li><a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/ontology/Artist">dbpedia-owl:Artist</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/ontology/Person">dbpedia-owl:Person</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/Person100007846">http://dbpedia.org/class/yago/Person100007846</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/AmericanFilmDirectors">http://dbpedia.org/class/yago/AmericanFilmDirectors</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/YaleUniversityAlumni">http://dbpedia.org/class/yago/YaleUniversityAlumni</a></li>
<li><a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/OhioUniversityAlumni">http://dbpedia.org/class/yago/OhioUniversityAlumni</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://sw.opencyc.org/2008/06/10/concept/Mx4rvVjWoZwpEbGdrcN5Y29ycA">opencyc:en/MaleHuman</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/AmericanFilmActors">http://dbpedia.org/class/yago/AmericanFilmActors</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/Liberals">http://dbpedia.org/class/yago/Liberals</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/OhioActors">http://dbpedia.org/class/yago/OhioActors</a></li>
<li><a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/UnitedStatesNavySailors">http://dbpedia.org/class/yago/UnitedStatesNavySailors</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/PeopleFromWestport,Connecticut">http://dbpedia.org/class/yago/PeopleFromWestport,Connecticut</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://sw.opencyc.org/2008/06/10/concept/Mx4rwQB4UJwpEbGdrcN5Y29ycA">opencyc:en/JewishPerson</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://sw.opencyc.org/2008/06/10/concept/Mx4rwMRyTJwpEbGdrcN5Y29ycA">opencyc:en/ActorInMovies</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/LivingPeople">http://dbpedia.org/class/yago/LivingPeople</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/Actor109765278">http://dbpedia.org/class/yago/Actor109765278</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/AmericanVegetarians">http://dbpedia.org/class/yago/AmericanVegetarians</a></li>
<li><a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/AmericanPhilanthropists">http://dbpedia.org/class/yago/AmericanPhilanthropists</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/KenyonCollegeAlumni">http://dbpedia.org/class/yago/KenyonCollegeAlumni</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/WesternFilmActors">http://dbpedia.org/class/yago/WesternFilmActors</a></li>
<li> <a rel="nofollow" class="uri" target="_blank" href="http://dbpedia.org/class/yago/ActorsStudioAlumni">http://dbpedia.org/class/yago/ActorsStudioAlumni</a></li>
<li>and a hundred other dbpedia_yago superClasses.</li>
</ul>
</td>
</tr>
</tbody>
</table>
<p>Furthermore, because of its strong, reciprocal entailments, the <span style="font-weight:bold;">owl:sameAs</span> assertion would also now entail that the person <span style="font-style:italic;">Paul Newman</span> has the <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/first_use">nyt:first_use</a> and <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/latest_use">nyt:latest_use</a> attributes, clearly illogical for a &#8220;person&#8221; thing.</p>
<p>This connection is clearly wrong in both directions. <span style="font-style:italic;">Articles</span> are not <span style="font-style:italic;">persons</span> and don&#8217;t have <span style="font-style:italic;">marital status</span>; and <span style="font-style:italic;">persons</span> do not have <span style="font-style:italic;">first_uses</span>. By misapplying this <span style="font-weight:bold;">sameAs</span> linkage relationship, we have screwed things up in every which way. And the error began with misunderstanding what kinds of &#8220;things&#8221; our data is about.</p>
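<p>To make the two-way leakage concrete, here is a toy sketch in plain Python of how a reasoner treats <span style="font-weight:bold;">owl:sameAs</span>. The handful of triples is a simplified stand-in, not a dump of the real NYT or DBpedia records; the <span style="font-style:italic;">nytd:</span> and <span style="font-style:italic;">dbpedia:</span> prefixes are shorthand we assume for readability; and the closure only substitutes in the subject position, which is enough to show the effect.</p>

```python
# Toy illustration of owl:sameAs entailment over a handful of triples.
# The triples are simplified stand-ins, not the full NYT or DBpedia records.
SAME_AS = "owl:sameAs"

triples = {
    ("nytd:newman_paul_per", "rdf:type", "skos:Concept"),
    ("nytd:newman_paul_per", "nyt:first_use", "2001-03-18"),
    ("nytd:newman_paul_per", SAME_AS, "dbpedia:Paul_Newman"),
    ("dbpedia:Paul_Newman", "rdf:type", "foaf:Person"),
    ("dbpedia:Paul_Newman", "rdf:type", "dbpedia-owl:Actor"),
}


def same_as_closure(graph):
    """Copy every statement across each owl:sameAs pair until nothing new appears."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for (s, p, o) in inferred if p == SAME_AS]
        for a, b in pairs:
            for s, p, o in list(inferred):
                if p == SAME_AS:
                    continue  # keep the identity links themselves out of the copy step
                for src, dst in ((a, b), (b, a)):
                    if s == src and (dst, p, o) not in inferred:
                        inferred.add((dst, p, o))
                        changed = True
    return inferred


entailed = same_as_closure(triples)
new_facts = sorted(entailed - triples)
for fact in new_facts:
    print(fact)
```

<p>The new statements include the article record typed as a <span style="font-style:italic;">foaf:Person</span> and the DBpedia person carrying an <span style="font-style:italic;">nyt:first_use</span> date, which are exactly the two illogical directions described above.</p>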
<h4>Some Options</h4>
<p>However, there are solutions. First, the <span style="font-weight:bold;">sameAs</span> assertions, at least involving these external resources, should be dropped.</p>
<p>Second, if linkages are still desired, a vocabulary such as <a rel="nofollow" target="_blank" href="http://umbel.org">UMBEL</a> <a rel="nofollow" href="#ld4">[4]</a> could be used to make an assertion between such a concept and these other related resources. So, even though these resources are not the same, they are <strong>closely</strong> related. The UMBEL ontology helps us to define this kind of relation between related, but non-identical, resources.</p>
<p>Instead of using the <span style="font-weight:bold;">owl:sameAs</span> property, we would suggest using <span style="font-weight:bold;">umbel:linksEntity</span>, which links a <span style="font-weight:bold;">skos:Concept</span> to related named-entity resources. Additionally, Freebase, which also currently asserts a <span style="font-weight:bold;">sameAs</span> relationship to the NYT resource, could use the <span style="font-weight:bold;">umbel:isAbout</span> relationship to assert that their resource &#8220;is about&#8221; a certain concept, which is the one defined by the NYT.</p>
<p>Alternatively, still other external vocabularies that more precisely capture the intent of the NYT publishers could be found, or the NYT editors could define their own properties specifically addressing their unique linkage interests. </p>
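<p>As a rough sketch of the second option, the following Python rewrites identity links on concept records into the weaker <span style="font-weight:bold;">umbel:linksEntity</span> relation. The triples, the prefixed names and the <span style="font-style:italic;">relink_concepts</span> helper are illustrative assumptions on our part; a real conversion would work on the published RDF with full URIs.</p>

```python
# Sketch: replace identity links on concept records with a weaker "related to" link.
# umbel:linksEntity is the property named in the text; prefixed names are shorthand.


def relink_concepts(graph, weaker="umbel:linksEntity"):
    """Rewrite owl:sameAs links whose subject is typed skos:Concept."""
    concepts = {s for (s, p, o) in graph if p == "rdf:type" and o == "skos:Concept"}
    rewritten = set()
    for s, p, o in graph:
        if p == "owl:sameAs" and s in concepts:
            rewritten.add((s, weaker, o))
        else:
            rewritten.add((s, p, o))
    return rewritten


graph = {
    ("nytd:newman_paul_per", "rdf:type", "skos:Concept"),
    ("nytd:newman_paul_per", "owl:sameAs", "dbpedia:Paul_Newman"),
    ("nytd:newman_paul_per", "owl:sameAs", "freebase:en.paul_newman"),
    ("dbpedia:Paul_Newman", "rdf:type", "foaf:Person"),
}

relinked = relink_concepts(graph)
for triple in sorted(relinked):
    print(triple)
```

<p>Because the rewritten links no longer claim identity, a reasoner has no grounds to copy person types onto the concept record or article attributes onto the person.</p>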
<h4>Other Minor Issues</h4>
<p>As a couple of additional, minor suggestions for the NYT dataset, we would suggest:</p>
<ul>
<li>Create a <span style="font-weight:bold;">foaf:Organization</span> description of the NYT organization, then use it with <span style="font-weight:bold;">dc:creator</span> and <span style="font-weight:bold;">dcterms:rightsHolder</span> rather than using a literal, and</li>
<li>The dual URIs such as &#8220;<a rel="nofollow" target="_blank" href="http://data.nytimes.com/N31738445835662083893">http://data.nytimes.com/N31738445835662083893</a>&#8221; and &#8220;<a rel="nofollow" target="_blank" href="http://data.nytimes.com/newman_paul_per">http://data.nytimes.com/newman_paul_per</a>&#8221; are not wrong in themselves, but the purpose is hard to understand. Why does a single organization need to create multiple URIs for the <strong class="moz-txt-star">identical resource</strong>, when it comes from the same system and has the same purpose?</li>
</ul>
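<p>The first suggestion can be sketched as a small set of statements built in Python. The organization URI below is hypothetical (the NYT has not published one in the material we reviewed), and the name literal and helper functions are likewise only for illustration.</p>

```python
# Hypothetical URI for the organization; it is an assumption, not a published NYT identifier.
NYT_ORG = "http://example.org/nyt/organization"
RECORD = "http://data.nytimes.com/newman_paul_per"


def publisher_triples(record_uri, org_uri, org_name):
    """Describe the organization once, then point the record at it by URI."""
    return [
        (org_uri, "rdf:type", "foaf:Organization"),
        (org_uri, "foaf:name", org_name),
        (record_uri, "dc:creator", org_uri),
        (record_uri, "dcterms:rightsHolder", org_uri),
    ]


def is_uri(value):
    """Crude check used here to tell resource objects from literals."""
    return value.startswith("http://") or value.startswith("https://")


statements = publisher_triples(RECORD, NYT_ORG, "The New York Times Company")
for s, p, o in statements:
    kind = "resource" if is_uri(o) else "literal or prefixed name"
    print(p, "->", o, "(" + kind + ")")
```

<p>The point of the design is that <span style="font-weight:bold;">dc:creator</span> and <span style="font-weight:bold;">dcterms:rightsHolder</span> now refer to a describable resource, so other datasets can link to the same organization instead of matching on a string.</p>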
<h4>Re-visiting the Linkage &#8220;Rule&#8221;</h4>
<p>There are very valuable benefits from entailment, inference and logic to be gained from linking resources. However, if the nature of the &#8220;things&#8221; being linked, or the properties that define these linkages, are incorrect, then very wrong logical implications result. Great care and understanding should be applied to linkage assertions.</p>
<h3>In the End, the Challenge is Not Linked Data, but <span style="font-style:italic;text-decoration:underline;">Connected</span> Data</h3>
<p>Our critical comments are not meant to be disrespectful or picky. The NYT and TWC are prominent institutions for which we should expect leadership on these issues. Our criticisms (and we believe those of others) are also not an expression of a &#8220;<a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Hype_cycle">trough of disillusionment</a>&#8221; as <a rel="nofollow" target="_blank" href="http://twitter.com/gregboutin/status/5558525462">some</a> have been pointing out.</p>
<p>This posting is about poor practices, pure and simple. The time to correct them is now. If asked, we would be pleased to help either institution establish exemplar practices. This is not automatic, and it is not always easy. The data.gov datasets, in particular, will require much time and effort to get right. There is much documentation that needs to be transitioned and expressed in semantic Web formats.</p>
<p>In a broader sense, we also seem to lack a definition of best practices related to <span style="font-weight:bold;">vocabularies</span>, <span style="font-weight:bold;">schema</span> and <span style="font-weight:bold;">mappings</span>. The Berners-Lee rules are imprecise and insufficient as is. Prior guidance documents tend to cover how to publish data and make URIs linkable more than how to properly characterize, describe and connect the data.</p>
<p>Perhaps, in part, this is a bit of a semantics issue. The challenge is not the mechanics of <span style="font-style:italic;">linking data</span>, but the meaning and basis for <span class="double_u">connecting</span> that data. Connections require logic and rationality sufficient to reliably inform inference and rule-based engines. They also need to pass the sniff test as we &#8220;follow our nose&#8221; by clicking the links exposed by the data.</p>
<p>It is exciting to see high-quality content such as from national governments and major publishers like the New York Times begin to be exposed as linked data. When this content finally gets embedded into usable contexts, we should see manifest uses and benefits emerge. We hope both institutions take our criticisms in that spirit.</p>
<div style="background-color:#ffffcc;border:1px dotted yellow;margin:15px 60px;padding:8px;vertical-align:middle;margin:0pt 0pt 0pt 10px;width:300px;text-align:center;">This posting has been jointly authored by <a rel="nofollow" target="_blank" href="http://mkbergman.com"> Mike Bergman</a> and <a rel="nofollow" target="_blank" href="http://fgiasson.com/blog">Fred Giasson</a> and simultaneously published on both of their blogs, hoping to draw more attention to the need for better practices in publishing linked data.</div>
<hr style="margin:15px 0px;" size="1"/>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="ld1" name="ld1"></a> [1] The NYT dataset has been updated with improvements that fixed multiple issues from the first release. The problems listed herein, however, still pertain after these improvements.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="ld2" name="ld2"></a> [2] Tim Berners-Lee, 2006. Linked Data (Design Issues), first posted on 2006-07-27; last updated on 2009-06-18. See <a rel="nofollow" target="_blank" href="http://www.w3.org/DesignIssues/LinkedData.html">http://www.w3.org/DesignIssues/LinkedData.html</a>. Berners-Lee refers to the steps above as &#8220;rules,&#8221; but he elaborates they are expectations of behavior. Most later citations refer to these as &#8220;principles.&#8221;</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="ld3" name="ld3"></a> [3] Li Ding, Dominic DiFranzo, Sarah Magidson, Deborah L. McGuinness and Jim Hendler, 2009. Data-GovWiki: Towards Linked Government Data. See <a rel="nofollow" target="_blank" href="http://www.cs.vu.nl/%7Epmika/swc/documents/Data-gov%20Wiki-data-gov-wiki-v1.pdf">http://www.cs.vu.nl/~pmika/swc/documents/Data-gov%20Wiki-data-gov-wiki-v1.pdf</a>.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="ld4" name="ld4"></a> [4] UMBEL <em>(Upper Mapping and Binding Exchange Layer)</em> is a lightweight ontology structure in development for relating Web content and data to a standard set of subject concepts. Its purpose has resulted in the creation of an associated vocabulary geared to both class-instance and reciprocal relationships, as well as partial or likelihood relationships. See <a rel="nofollow" target="_blank" href="http://umbel.org/technical_documentation.html#vocabulary">http://umbel.org/technical_documentation.html#vocabulary</a>.</div>
]]></content:encoded>
      </item>
      <item>
         <title>3 Software Architecture Trends for 2010</title>
         <link>http://www.thewebsemantic.com/2009/11/16/3-software-architecture-trends-for-2010/</link>
         <description>I’ve recently returned from the &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.oredev.com/&quot;&gt;Øredev developer’s conference&lt;/a&gt; in Malmo, Sweden where I had the privilege of sharing knowledge with a very eclectic group of technologists. In addition to existing trends such as language agnosticism on the JVM, Agile, and mobile proliferation I noticed 3 emerging trends that stood out.</description>
         <guid isPermaLink="false">http://www.thewebsemantic.com/?p=186</guid>
         <pubDate>Mon, 16 Nov 2009 15:15:48 -0800</pubDate>
      </item>
      <item>
         <title>Audio: How Semantics Can Help Our Healthcare System</title>
         <link>http://www.semanticweb.com/features/audio_how_semantics_can_help_our_healthcare_system_143341.asp?c=rss</link>
         <description>&lt;p&gt;
&lt;strong&gt;Scott Koegler&lt;/strong&gt;&lt;br/&gt;
&lt;em&gt;SemanticWeb.com Contributor&lt;/em&gt;
&lt;/p&gt; &lt;p&gt;As principal consultant of &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semantec-inc.com&quot;&gt;Semantec&lt;/a&gt;, and also principal consultant for &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.tek-health.com&quot;&gt;The Intelligent Healthcare Practice&lt;/a&gt;, Stephen Lahanas is involved in trying to solve issues around one of the most talked-about areas of U.S. concern: the health care system. &lt;/p&gt; &lt;p&gt;Listen to my interview with Lahanas for the specifics of his take on how semantics can and should be leveraged to help our healthcare system.&lt;/p&gt; &lt;p&gt;&lt;iframe class=&quot;embeddedvideo&quot; src=&quot;http://www.semanticweb.com/embed/StephenLahanasInterview.mp3&quot; width=&quot;300&quot; height=&quot;25&quot;&gt;&lt;/iframe&gt; &lt;/p&gt; &lt;p&gt;Stephen covers several key issues in this conversation, including:&lt;/p&gt; &lt;blockquote&gt;&lt;p&gt;*The existing highly varied, and often proprietary, platforms storing medical data present tremendous difficulty when trying to consolidate the data from these different systems. &lt;/p&gt; &lt;p&gt;*Traditional data integration techniques relying on static mapping methodologies are likely to be cumbersome and take a long time to complete, if they can ever be completed. &lt;/p&gt; &lt;p&gt;*The use of semantic technologies to create abstraction layers that bring the different data structures together as common, accessible systems.&lt;/p&gt;&lt;/blockquote&gt; &lt;p&gt;&lt;img alt=&quot;stephenlahanas.jpg&quot; src=&quot;http://www.semanticweb.com/original/stephenlahanas.jpg&quot; width=&quot;131&quot; height=&quot;172&quot; align=&quot;right&quot; vspace=&quot;6&quot; hspace=&quot;3&quot;/&gt;Lahanas points to the DoD's Alta program and the VA's VISTA program that, combined, have spent about $7 billion. These systems are looking at the Nationwide Health Information Network (NHIN) to get the job completed. 
But because there may be dozens, or even hundreds, of data formats to deal with, Lahanas says that dynamically defined integration is likely to be the better choice.&lt;/p&gt; &lt;p&gt;He compares the choice of semantic technologies with the previous set of W3C standards, most prominently XML, and comments on how the adoption of XML actually increased the complexity of integrating systems because of its highly flexible structure. Lahanas sees an end product as a system that allows users to query a unified system for information they need, rather than rely on the current system's static reports.&lt;/p&gt; &lt;p&gt;The ultimate goal is to redefine healthcare IT lifecycle management, going beyond the management of practice information and simple data storage, to the exploitation of the knowledge contained within the data. Lahanas points to the efficiencies that can be gained by creating transparency between and within huge healthcare systems such as the Army or Air Force.&lt;/p&gt;</description>
         <guid isPermaLink="false">http://www.semanticweb.com/features/audio_how_semantics_can_help_our_healthcare_system_143341.asp?c=rss</guid>
         <pubDate>Mon, 16 Nov 2009 14:10:43 -0800</pubDate>
         <category>Features</category>
         <enclosure length="10186072" url="http://www.semanticweb.com/embed/StephenLahanasInterview.mp3" type="audio/mpeg"/>
         <enclosure length="13737" url="http://www.semanticweb.com/original/stephenlahanas.jpg" type="image/jpeg"/>
      </item>
      <item>
         <title>Audio: How Semantics Can Help Our Healthcare System</title>
         <link>http://www.semanticweb.com/features/audio_how_semantics_can_help_our_healthcare_system_143341.asp?c=rss</link>
         <description>&lt;p&gt;
&lt;strong&gt;Scott Koegler&lt;/strong&gt;&lt;br/&gt;
&lt;em&gt;SemanticWeb.com Contributor&lt;/em&gt;
&lt;/p&gt; &lt;p&gt;As principal consultant of &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semantec-inc.com&quot;&gt;Semantec&lt;/a&gt;, and also principal consultant for &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.tek-health.com&quot;&gt;The Intelligent Healthcare Practice&lt;/a&gt;, Stephen Lahanas is involved in trying to solve issues around one of the most talked-about areas of U.S. concern - the health care system. &lt;/p&gt; &lt;p&gt;Listen to my interview with Lahanas for the specifics of his take on how semantics can and should be leveraged to help our healthcare system.&lt;/p&gt; &lt;p&gt;&lt;iframe class=&quot;embeddedvideo&quot; src=&quot;http://www.semanticweb.com/embed/StephenLahanasInterview.mp3&quot; width=&quot;300&quot; height=&quot;25&quot;&gt; &lt;/p&gt; &lt;p&gt;Stephen covers several key issues in this conversation, including:&lt;/p&gt; &lt;blockquote&gt;*The existing highly different, and often proprietary platforms storing medical data present tremendous difficulty when trying to consolidate the data from these different systems. &lt;p&gt;*Traditional data integration techniques relying on static mapping methodologies are likely to be cumbersome and take a long time to complete, if they can ever be completed. &lt;/p&gt; &lt;p&gt;*The use of semantic technologies to create abstraction layers that bring the different data structures together as common, accessible systems.&lt;/blockquote&gt;&lt;/p&gt; &lt;p&gt;&lt;img alt=&quot;stephenlahanas.jpg&quot; src=&quot;http://www.semanticweb.com/original/stephenlahanas.jpg&quot; width=&quot;131&quot; height=&quot;172&quot; align=&quot;right&quot; vspace=&quot;6&quot; hspace=&quot;3&quot;/&gt;Lahanas points to the DoD's Alta program and the VA's VISTA program that, combined, have spent about $7 billion. These systems are looking at the Nationwide Health Information Network (NHIN) to get the job completed. 
But because of the fact that there may be dozens, or even hundreds of data formats to deal with, Lahanas says that dynamically defined integration is likely to be the better choice.&lt;/p&gt; &lt;p&gt;He compares the choice of semantic technologies with the previous set of W3C standards, most prominently XML, and comments on how the adoption of XML actually increased the complexity of integrating systems because of its highly flexible structure. Lahanas sees an end product as a system that allows users to query a unified system for information they need, rather than rely on the current system static reports.&lt;/p&gt; &lt;p&gt;The ultimate goal is to redefine the healthcare IT lifecycle management, going beyond the management of practice information and simple data storage, to the exploitation of the knowledge contained within the data. Lahanas points to the efficiencies that can be gained by creating a transparency between and within huge healthcare systems such as the Army or Air Force.&lt;/p&gt; &lt;p&gt;New Career Opportunities Daily: The &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.mediabistro.com/joblistings/?c=rss&quot;&gt;best jobs in media&lt;/a&gt;. &lt;/p&gt;</description>
         <guid isPermaLink="false">http://www.semanticweb.com/features/audio_how_semantics_can_help_our_healthcare_system_143341.asp?c=rss</guid>
         <pubDate>Mon, 16 Nov 2009 14:10:43 -0800</pubDate>
         <category>Features</category>
         <enclosure length="10186072" url="http://www.semanticweb.com/embed/StephenLahanasInterview.mp3" type="audio/mpeg"/>
         <enclosure length="13737" url="http://www.semanticweb.com/original/stephenlahanas.jpg" type="image/jpeg"/>
      </item>
      <item>
         <title>Audio: How Semantics Can Help Our Healthcare System</title>
         <link>http://www.semanticweb.com/features/audio_how_semantics_can_help_our_healthcare_system_143341.asp?c=rss</link>
         <description>&lt;p&gt;
&lt;strong&gt;Scott Koegler&lt;/strong&gt;&lt;br/&gt;
&lt;em&gt;SemanticWeb.com Contributor&lt;/em&gt;
&lt;/p&gt; &lt;p&gt;As principal consultant of &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semantec-inc.com&quot;&gt;Semantec&lt;/a&gt;, and also principal consultant for &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.tek-health.com&quot;&gt;The Intelligent Healthcare Practice&lt;/a&gt;, Stephen Lahanas is involved in trying to solve issues around one of the most talked-about areas of U.S. concern - the health care system. &lt;/p&gt; &lt;p&gt;Listen to my interview with Lahanas for the specifics of his take on how semantics can and should be leveraged to help our healthcare system.&lt;/p&gt; &lt;p&gt;&lt;iframe class=&quot;embeddedvideo&quot; src=&quot;http://www.semanticweb.com/embed/StephenLahanasInterview.mp3&quot; width=&quot;300&quot; height=&quot;25&quot;&gt; &lt;/p&gt; &lt;p&gt;Stephen covers several key issues in this conversation, including:&lt;/p&gt; &lt;blockquote&gt;*The existing highly different, and often proprietary platforms storing medical data present tremendous difficulty when trying to consolidate the data from these different systems. &lt;p&gt;*Traditional data integration techniques relying on static mapping methodologies are likely to be cumbersome and take a long time to complete, if they can ever be completed. &lt;/p&gt; &lt;p&gt;*The use of semantic technologies to create abstraction layers that bring the different data structures together as common, accessible systems.&lt;/blockquote&gt;&lt;/p&gt; &lt;p&gt;&lt;img alt=&quot;stephenlahanas.jpg&quot; src=&quot;http://www.semanticweb.com/original/stephenlahanas.jpg&quot; width=&quot;131&quot; height=&quot;172&quot; align=&quot;right&quot; vspace=&quot;6&quot; hspace=&quot;3&quot;/&gt;Lahanas points to the DoD's Alta program and the VA's VISTA program that, combined, have spent about $7 billion. These systems are looking at the Nationwide Health Information Network (NHIN) to get the job completed. 
But because there may be dozens, or even hundreds, of data formats to deal with, Lahanas says that dynamically defined integration is likely to be the better choice.&lt;/p&gt; &lt;p&gt;He compares the choice of semantic technologies with the previous set of W3C standards, most prominently XML, and comments on how the adoption of XML actually increased the complexity of integrating systems because of its highly flexible structure. Lahanas sees the end product as a system that allows users to query a unified system for the information they need, rather than rely on the current systems' static reports.&lt;/p&gt; &lt;p&gt;The ultimate goal is to redefine healthcare IT lifecycle management, going beyond the management of practice information and simple data storage, to the exploitation of the knowledge contained within the data. Lahanas points to the efficiencies that can be gained by creating transparency between and within huge healthcare systems such as those of the Army or Air Force.&lt;/p&gt;</description>
         <guid isPermaLink="false">http://www.semanticweb.com/features/audio_how_semantics_can_help_our_healthcare_system_143341.asp?c=rss</guid>
         <pubDate>Mon, 16 Nov 2009 14:10:43 -0800</pubDate>
         <category>Features</category>
         <enclosure length="10186072" url="http://www.semanticweb.com/embed/StephenLahanasInterview.mp3" type="audio/mpeg"/>
         <enclosure length="13737" url="http://www.semanticweb.com/original/stephenlahanas.jpg" type="image/jpeg"/>
      </item>
      <item>
         <title>Pellet 2.0 Release</title>
         <link>http://clarkparsia.com/weblog/2009/11/16/pellet-2-release/</link>
         <description>We&amp;#8217;re happy to announce the release of Pellet 2.0, the first OWL 2 DL reasoner available commercially. Pellet 2.0 is available for use in open source projects under the AGPL v.3 license; for commercial usage, alternative license terms are available. During the past 13 months, we closed 199 tickets as part of the 2.0 release [...]</description>
         <guid isPermaLink="false">http://clarkparsia.com/weblog/?p=862</guid>
         <pubDate>Mon, 16 Nov 2009 12:25:10 -0800</pubDate>
         <content:encoded><![CDATA[<p>We&#8217;re happy to announce the release of <a rel="nofollow">Pellet 2.0</a>, the first <span class="caps">OWL</span> 2 DL reasoner available commercially. Pellet 2.0 is available for use in open source projects under the <span class="caps">AGPL </span>v.3 license; for commercial usage, alternative license terms are <a rel="nofollow" target="_blank" href="mailto:%69%6E%71%75%69%72%69%65%73%40%63%6C%61%72%6B%70%61%72%73%69%61%2E%63%6F%6D">available</a>.</p> <p>During the past 13 months, we closed 199 <a rel="nofollow" target="_blank" href="http://clark-parsia.trac.cvsdude.com/pellet-devel/report">tickets</a> as part of the 2.0 release candidate cycle, including numerous enhancements and bug fixes. Please see <span class="caps">CHANGES.</span>txt in the distribution for a complete change log; some highlights include:</p> <ul>
<li>full <a rel="nofollow" target="_blank" href="http://www.w3.org/TR/owl2-overview/"><span class="caps">OWL</span> 2</a> support (modulo a few bugs that will be fixed in the 2.1 release)</li>
<li>supports domain &#038; range axioms, class expressions, qualified cardinality restrictions, literal constants, annotations, and nested class expressions in <span class="caps">SPARQL </span>queries</li>
<li>support for all <span class="caps">SWRL </span>builtins, including previously missing builtins (substring, tokenize, and optional precision parameters for roundHalfToEven)</li>
<li>optimized support for <span class="caps">OWL</span> 2 EL reasoning; <span class="caps">OWL</span> 2 EL reasoner is autoselected based on data input</li>
<li>supports automated ontology module extraction</li>
<li>supports incremental classification</li>
<li>supports fine-grained inference extraction</li>
<li>enhanced <span class="caps">SWRL </span>rules performance</li>
<li><span class="caps">OWLAPI </span>v3 support</li>
<li>lots of improvements and cleanups to Pellet&#8217;s command line tools</li>
<li>updated to work with Jena 2.6.2 &#8212; Pellet is the only DL reasoner available from Jena</li>
<li>supports explanations via Jena</li>
<li>supports autoselecting the best <span class="caps">SPARQL </span>query engine based on the input query</li>
<li>user-defined timeouts for reasoning</li>
<li>switch to dual license model to support commercial and open source projects</li>
</ul> <p>This release marks a change in the Pellet development process: starting with 2.1, Pellet will be released according to a time-based development cycle. We will do four quarterly releases per year. We will make point releases between the quarterly releases, as necessary, to fix critical bugs only. Thus, the release schedule for the 2.x series will be 29 March 2010, 28 June 2010, 27 September 2010, and 20 December 2010.</p> <p>We believe this new development and release process will further accelerate the commercialization of Pellet, with no undue impact on its utility for either research or other non-commercial applications.</p> <p>Finally, with the release of Pellet 2.0, we will no longer support previous versions via the pellet-users mailing list. </p>]]></content:encoded>
      </item>
      <item>
         <title>When Linked Data Rules Fail</title>
         <link>http://feedproxy.google.com/~r/AI3_AdaptiveInformation/~3/LdAkKAJGeOY/</link>
         <description>&lt;span class=&quot;Z3988&quot; title=&quot;ctx_ver=Z39.88-2004&amp;amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;amp;rft.title=When Linked Data Rules Fail&amp;amp;rft.aulast=Bergman&amp;amp;rft.aufirst=Mike&amp;amp;rft.subject=Linked Data&amp;amp;rft.subject=Ontology Best Practices&amp;amp;rft.subject=Semantic Web&amp;amp;rft.source=AI3:::Adaptive Information&amp;amp;rft.date=2009-11-16&amp;amp;rft.type=blogPost&amp;amp;rft.format=text&amp;amp;rft.identifier=http://www.mkbergman.com/846/when-linked-data-rules-fail/&amp;amp;rft.language=English&quot;&gt;&lt;/span&gt;High Visibility Problems with NYT, data.gov Show Need for Better Practices
When I say, &amp;#8220;shot&amp;#8221;, what do you think of? A flu shot? A shot of whisky? A moon shot? A gun shot? What if I add the term &amp;#8220;bank&amp;#8221;? [...]</description>
         <guid isPermaLink="false">http://www.mkbergman.com/?p=846</guid>
         <pubDate>Mon, 16 Nov 2009 09:04:01 -0800</pubDate>
         <content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=When Linked Data Rules Fail&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Linked Data&amp;rft.subject=Ontology Best Practices&amp;rft.subject=Semantic Web&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2009-11-16&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/846/when-linked-data-rules-fail/&amp;rft.language=English"></span>
<p><a rel="nofollow" target="_blank" href="http://www.adhd-mindbydesign.com/"><img style="border:0px solid;width:220px;height:223px;float:left;margin-right:10px;" title="Image Source: www.adhd-mindbydesign.com" alt="Image Source: www.adhd-mindbydesign.com" hspace="5" vspace="5" align="left"/></a></p>
<h2>High Visibility Problems with NYT, data.gov Show Need for Better Practices</h2>
<p>When I say, &#8220;shot&#8221;, what do you think of? A flu shot? A shot of whisky? A moon shot? A gun shot? What if I add the term &#8220;bank&#8221;? Do you now think of someone being shot in an armed robbery of a local bank or similar?</p>
<p>And, now, what if I add a reference to say, <a rel="nofollow" style="font-style:italic;" target="_blank" href="http://en.wikipedia.org/wiki/The_Hustler_%28film%29">The Hustler</a>, or Minnesota Fats, or &#8220;Fast Eddie&#8221; Felson? Do you now see the connection to a pressure-packed banked pool shot in some smoky bar room?</p>
<p>As humans we need context to make connections and remove ambiguity. For machines, with their limited reasoning and inference engines, context and accurate connections are even more important.</p>
<p>Over the past few weeks we have seen announcements of two large and high-visibility <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Linked_data">linked data</a> projects: One, a first release of references for articles concerning about 5,000 people from the New York Times at <a rel="nofollow" target="_blank" href="http://data.nytimes.com/">data.nytimes.com</a>; and Two, a massive exposure of 5 billion triples from <a rel="nofollow" target="_blank" href="http://tw.rpi.edu/">data.gov</a> datasets provided by the <a rel="nofollow" target="_blank" href="http://tw.rpi.edu/">Tetherless World Constellation</a> (TWC) at <a rel="nofollow" target="_blank" href="http://rpi.edu/">Rensselaer Polytechnic Institute</a> (RPI).</p>
<p>On various grounds from <a rel="nofollow" target="_blank" href="http://go-to-hellman.blogspot.com/2009/10/new-york-times-blunders-into-linked.html"> licensing</a> to <a rel="nofollow" target="_blank" href="http://dowhatimean.net/2009/10/linked-data-at-the-new-york-times-exciting-but-buggy"> data characterization</a> and to creating linked data for its <a rel="nofollow" target="_blank" href="http://www.betaversion.org/%7Estefano/linotype/news/351/">own sake</a>, some prominent commentators have weighed in on what is good and what is not so good with these datasets. One of us, Mike, <a rel="nofollow">commented</a> about a week ago that &#8220;we have now moved beyond &#8216;proof of concept&#8217; to the need for actual useful data of trustworthy provenance and proper mapping and characterization. Recent efforts are a disappointment that no enterprise would or could rely upon.&#8221;</p>
<p>Reactions to <a rel="nofollow">that posting</a> and continued discussion on various <a rel="nofollow" target="_blank" href="http://lists.w3.org/Archives/Public/public-esw-thes/2009Nov/0000.html"> mailing lists</a> warrant a more precise dissection of what is wrong and still needs to be done with these datasets <a rel="nofollow" href="#ld1">[1]</a>.</p>
<h3>Berners-Lee&#8217;s Four Linked Data &#8220;Rules&#8221;</h3>
<p>It is useful, then, to return to first principles, namely the original four &#8220;rules&#8221; posed by Tim Berners-Lee in his design note on linked data <a rel="nofollow" href="#ld2">[2]</a>:</p>
<ol>
<li>Use URIs as names for things</li>
<li>Use HTTP URIs so that people can look up those names</li>
<li>When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)</li>
<li>Include links to other URIs so that they can discover more things.</li>
</ol>
<p>The first two rules are definitional to the idea of linked data. They cement the basis of linked data in the Web, and are not at issue with either of the two linked data projects that are the subject of this posting.</p>
<p>However, it is the lack of specifics and guidance in the last two rules where the breakdowns occur. Both the NYT and the RPI datasets suffer from a lack of &#8220;providing useful information&#8221; (Rule #3). And, the <span class="double_u">nature</span> of the links in Rule #4 is a real problem for the NYT dataset.</p>
<h3>What Constitutes &#8220;Useful Information&#8221;?</h3>
<p>The Wikipedia entry on <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Linked_data">linked data</a> expands on &#8220;useful information&#8221; by augmenting the original rule with the parenthetical clause, &#8220;(<span style="font-style:italic;">i.e.</span>, a structured description — metadata).&#8221; But even that expansion is insufficient.</p>
<p>Fundamentally, what are we talking about with linked data? Well, we are talking about instances that are characterized by one or more attributes. Those instances exist within contexts of various natures. And, those contexts may relate to other existing contexts.</p>
<p>We can break this problem description down into three parts:</p>
<ul>
<li>A <span style="font-weight:bold;font-style:italic;">vocabulary</span> that defines the nature of the instances and their descriptive attributes</li>
<li>A <span style="font-weight:bold;font-style:italic;">schema</span> of some nature that describes the structural relationships amongst instances and their characteristics, and, optimally,</li>
<li>A <span style="font-weight:bold;font-style:italic;">mapping</span> to existing external schema or constructs that help place the data into context.</li>
</ul>
<p>At minimum, <span class="double_u">ANY</span> dataset exposed as linked data needs to be described by a <span style="font-weight:bold;font-style:italic;">vocabulary</span>. Both the NYT and RPI datasets fail on this score, as we elaborate below. Better practice is to also provide a <span style="font-weight:bold;font-style:italic;">schema</span> of relationships in which to embed each instance record. And, best practice is to also <span style="font-weight:bold;font-style:italic;">map</span> those structures to external schema.</p>
<p>Lacking this &#8220;useful information&#8221;, especially a defining vocabulary, we cannot begin to understand whether our instances deal with drinks, bank robberies or pool shots. This lack, in essence, makes the information worthless, even though available via URL.</p>
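<p>As a rough illustration of the minimum requirement described above, the following sketch checks whether every property used in a set of instance triples is defined by an accompanying vocabulary. All names, values and definitions here (the <span style="font-style:italic;">ex:</span> prefix, the records, the definition text) are hypothetical and chosen only for illustration; they are not taken from either dataset.</p>

```python
# Hypothetical instance data as (subject, property, value) triples.
instance_triples = [
    ("ex:record1", "ex:item_code", "A1"),
    ("ex:record1", "ex:seasonal", "U"),
]

# Hypothetical vocabulary: maps each defined property to a human-readable definition.
vocabulary = {
    "ex:item_code": "Code identifying the category the record describes",
}


def undefined_properties(triples, vocab):
    """Return, sorted, the properties used in the data that the vocabulary does not define."""
    used = {prop for (_subject, prop, _value) in triples}
    return sorted(used - set(vocab))


print(undefined_properties(instance_triples, vocabulary))
```

<p>Any property this check reports is one a consumer can read only as a bare label, which is the situation described above.</p>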
<h4>The data.gov (RPI) Case</h4>
<p>With the support of NSF and various grant funding, RPI has set up the <a rel="nofollow" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/The_Data-gov_Wiki">Data-Gov Wiki</a> <a rel="nofollow" href="#ld3">[3]</a>, which is in the process of converting the datasets on <a rel="nofollow" target="_blank" href="http://www.data.gov/">data.gov</a> to RDF, placing them into a semantic wiki to enable comment and annotation, and providing that data as RSS feeds. Other demos are also being placed on the site.</p>
<p>As of the date of this posting, the site had a <a rel="nofollow" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/Data.gov_Catalog">catalog</a> of 116 datasets from the 800 or so available on data.gov, leading to these statistics:</p>
<ul>
<li>459,412,419 table entries</li>
<li>5,074,932,510 triples, and</li>
<li>7,564 properties (or attributes).</li>
</ul>
<p>We&#8217;ll take one of these datasets, <a rel="nofollow" target="_blank" href="http://www.data.gov/details/319">#319</a>, and look a bit closer at it:</p>
<table border="1" cellspacing="0" cellpadding="4">
<tbody>
<tr>
<th style="background-color:#cccccc;"> Wiki</th>
<th style="background-color:#cccccc;"> Title</th>
<th style="background-color:#cccccc;"> Agency</th>
<th style="background-color:#cccccc;"> Name</th>
<th style="background-color:#cccccc;"> data.gov Link</th>
<th style="background-color:#cccccc;"> No Properties</th>
<th style="background-color:#cccccc;"> No Triples</th>
<th style="background-color:#cccccc;"> RDF File</th>
</tr>
<tr>
<td><a rel="nofollow" title="Dataset 319" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/Dataset_319">Dataset 319</a></td>
<td>Consumer Expenditure Survey</td>
<td><a rel="nofollow" title="Department of Labor" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/Department_of_Labor">Department of Labor</a></td>
<td><a rel="nofollow" title="LABOR-STAT (page does not exist)" target="_blank" href="http://data-gov.tw.rpi.edu/w/index.php?title=LABOR-STAT&amp;action=edit&amp;redlink=1">LABOR-STAT</a></td>
<td><a rel="nofollow" title="http://www.data.gov/details/319" target="_blank" href="http://www.data.gov/details/319">http://www.data.gov/details/319</a></td>
<td style="text-align:right;">22</td>
<td style="text-align:right;">1,583,236</td>
<td><a rel="nofollow" title="http://data-gov.tw.rpi.edu/raw/319/index.rdf" target="_blank" href="http://data-gov.tw.rpi.edu/raw/319/index.rdf">http://data-gov.tw.rpi.edu/raw/319/index.rdf</a></td>
</tr>
</tbody>
</table>
<p>This report was picked solely because it had a small number of attributes (properties), and is thus easier to screen capture. The summary report on the wiki is shown by this <a rel="nofollow" target="_blank" href="http://data-gov.tw.rpi.edu/wiki/Dataset_319">page</a>:</p>
<div style="margin:10px;text-align:center;"><a rel="nofollow" target="_blank" href="http://mkbergman.com/wp-content/themes/ai3/images/2009Posts/091115_wiki_dataset_319.png"> <img class="center_ok" style="border:0px solid;width:600px;height:611px;" title="Click to expand" src="http://mkbergman.com/wp-content/themes/ai3/images/2009Posts/091115_wiki_dataset_319.png" alt="Data-gov-Wiki Dataset #319" width="1093" height="1113"/></a>
<p><span style="font-style:italic;font-size:90%;">(click to expand)</span></p></div>
<p>So, we see that this specific dataset contains 22 of the nearly 8,000 attributes across all datasets.</p>
<p>When we click on one of these attribute names, we are then taken to a specific wiki page that only reiterates its label. There is no definition or explanation.</p>
<p>When we inspect this page further we see that, other than the broad characterization of the dataset itself (the bulk of the page), the bottom of the page lists 22 undefined attributes with labels such as <span style="font-style:italic;">item code</span>, <span style="font-style:italic;">periodicity code</span>, <span style="font-style:italic;">seasonal</span>, and the like. These attributes are the real structural basis for the data in this dataset.</p>
<p>But, what does all of this mean???</p>
<p>To gain a clue, now let&#8217;s go to the source data.gov site for this <a rel="nofollow" target="_blank" href="http://www.data.gov/details/319">dataset (#319)</a>. Here is how that report looks:</p>
<div style="margin:10px;text-align:center;"><a rel="nofollow" target="_blank" href="http://mkbergman.com/wp-content/themes/ai3/images/2009Posts/091115_data_gov_319.png"> <img class="center_ok" style="border:0px solid;width:600px;height:1146px;" title="Click to expand" src="http://mkbergman.com/wp-content/themes/ai3/images/2009Posts/091115_data_gov_319.png" alt="Data.gov Dataset #319" width="1036" height="1978"/></a>
<p><span style="font-style:italic;font-size:90%;">(click to expand)</span></p></div>
<p>Contained within this report we see a listing for additional <a rel="nofollow" target="_blank" href="ftp://ftp.bls.gov/pub/time.series/cx/cx.txt">metadata</a>. This link tells us about the various data fields contained in this dataset; we see many of these attributes are &#8220;codes&#8221; to various data categories.</p>
<p>Probing further into the dataset&#8217;s <a rel="nofollow" target="_blank" href="http://www.bls.gov/cex/">technical documentation</a>, we see that there is indeed a rich structure underneath this report, again provided via various code lookups. There are codes for geography, seasonality (adjusted or not), consumer demographic profiles and a variety of consumption categories. (See, for example, the link to this <a rel="nofollow" target="_blank" href="http://www.bls.gov/cex/csxgloss.htm">glossary page</a>.) These are the keys to understanding the actual values within this dataset.</p>
<p>For example, one major dimension of the data is captured by the attribute <span style="font-style:italic;">item_code</span>. The survey breaks down consumption expenditures within the broad categories of Food, Housing, Apparel and Services, Transportation, Health Care, Entertainment, and Other. Within a category, there is also a rich structural breakdown. For example, expenditures for Bakery Products within Food is given a <a rel="nofollow" target="_blank" href="ftp://ftp.bls.gov/pub/time.series/cx/cx.item">code</a> of FHC2.</p>
<p>But, nowhere are these codes defined or unlocked in the RDF datasets. This absence is true for virtually all of the datasets exposed on this wiki.</p>
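<p>To make concrete what &#8220;unlocking&#8221; a code could look like, here is a minimal sketch of a code list attached to the <span style="font-style:italic;">item_code</span> attribute. Only the FHC2 entry (Bakery Products within Food) comes from the BLS example above; the dictionary layout, the key names and the <span style="font-style:italic;">describe</span> helper are assumptions made for illustration, not anything published by data.gov or RPI.</p>

```python
# Hypothetical code list; only the FHC2 entry is taken from the example above.
item_codes = {
    "FHC2": {"label": "Bakery Products", "broader": "Food"},
}


def describe(record, code_list):
    """Render a record's item_code using the code list, or flag it as undefined."""
    code = record["item_code"]
    entry = code_list.get(code)
    if entry is None:
        return f"item_code {code}: undefined"
    return f"item_code {code}: {entry['label']} (within {entry['broader']})"


# With a published code list the value becomes interpretable ...
print(describe({"item_code": "FHC2"}, item_codes))
# ... and without one, the same value is just an opaque string.
print(describe({"item_code": "FHC2"}, {}))
```

<p>The second call mirrors the current state of the RDF: the code is present, but nothing in the data says what it stands for.</p>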
<p>So, for literally billions of triples, and 8,000 attributes, we have <span style="font-weight:bold;">ABSOLUTELY NO INFORMATION ABOUT WHAT THE DATA CONTAINS OTHER THAN A PROPERTY LABEL</span>. There is much, much rich value here in data.gov, but all of it remains locked up and hidden.</p>
<p>The sad truth about this data release is that it provides absolutely no value in its current form. We lack the keys to unlock the value.</p>
<p>To be sure, early essential spade work has been done here to begin putting in place the conversion infrastructure for moving text files, spreadsheets and the like to an RDF form. This is yeoman work important to ultimate access. But, until a <span style="font-weight:bold;font-style:italic;">vocabulary</span> is published that defines the attributes and their codes so we can unlock this value, it will remain hidden. And until a <span style="font-weight:bold;font-style:italic;">schema</span> of some nature is also published to connect attributes and relations across datasets, the further value from connecting the dots will also remain hidden.<img style="width:160px;height:218px;float:right;margin-left:10px;" title="The Hustler" src="http://mkbergman.com/wp-content/themes/ai3/images/2009Posts/091115_the_hustler.jpg" alt="The Hustler" align="right"/></p>
<p>These datasets may meet the partial conditions of providing clickable URLs, but the crucial &#8220;useful information&#8221; as to what any of this data means is absent.</p>
<p>Every single dataset on data.gov has supporting references to text files, PDFs, Web pages or the like that describe the nature of the data within each dataset. Until that information is exposed and made usable, we have no linked data.</p>
<p>Until ontologies get created from these technical documents, the value of these data instances remains locked up, and no value can be created from having these datasets expressed in RDF.</p>
<p>The devil lies in the details. The essential hard work has not yet begun.</p>
<h4>The NYT Case</h4>
<p>Though at a much smaller scale with many fewer attributes, the <a rel="nofollow" target="_blank" href="http://data.nytimes.com/">NYT dataset</a> suffers from the same failing: it too lacks a <span style="font-weight:bold;font-style:italic;">vocabulary</span>.</p>
<p>So, let&#8217;s take the case of one of the lead actors in <a rel="nofollow" style="font-style:italic;" target="_blank" href="http://en.wikipedia.org/wiki/The_Hustler_%28film%29">The Hustler</a>, Paul Newman, who played the role of &#8220;Fast Eddie&#8221; Felson. Here is the <a rel="nofollow" target="_blank" href="http://data.nytimes.com/N31738445835662083893.html">NYT record</a> for the &#8220;person&#8221; <span style="font-style:italic;">Paul Newman</span> (which they also refer to as <a rel="nofollow" target="_blank" href="http://data.nytimes.com/newman_paul_per">http://data.nytimes.com/newman_paul_per</a>). Note the header title of <span style="font-weight:bold;">Newman, Paul</span>:</p>
<div style="margin:10px;text-align:center;"><a rel="nofollow" target="_blank" href="http://mkbergman.com/wp-content/themes/ai3/images/2009Posts/091115_nyt_paul_newman.png"> <img class="center_ok" style="border:0px solid;width:600px;height:593px;" title="Click to expand" src="http://mkbergman.com/wp-content/themes/ai3/images/2009Posts/091115_nyt_paul_newman.png" alt="NYT 'Paul Newman Articles' Record" width="988" height="976"/></a>
<p><span style="font-style:italic;font-size:90%;">(click to expand)</span></p></div>
<p>Click on any of the internal labels used by the NYT for its own attributes (such as <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/first_use">nyt:first_use</a>), and you will be given this message:</p>
<div style="margin-left:40px;">
<p><span style="font-style:italic;">&#8220;An RDFS description and English language documentation for the NYT namespace will be provided soon. Thanks for your patience.&#8221;</span></p></div>
<p>We again have no idea what is meant by all of this data except for the labels used for its attributes. In this case for <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/first_use">nyt:first_use</a> we have a value of &#8220;2001-03-18&#8221;.</p>
<p>Hello? What? What is a &#8220;first use&#8221; for a &#8220;Paul Newman&#8221; of &#8220;2001-03-18&#8221;???</p>
<p>The NYT put the cart before the horse: even if minimal, they should have released their ontology first — or at least at the same time — as they released their data instances. (See further <a rel="nofollow"> this discussion</a> about how an ontology creation workflow can be incremental by starting simple and then upgrading as needed.)</p>
<h3>Links to Other Things</h3>
<p>Since there really are no links to other things on the Data-Gov Wiki, our focus in this section continues with the NYT dataset using our same example.</p>
<p>We now are in the territory of the fourth &#8220;rule&#8221; of linked data: <span style="font-style:italic;">4. Include links to other URIs so that they can discover more things</span>.</p>
<p>This will seem a bit basic at first, but before we can talk about linking to other things, we first need to understand and define the starting &#8220;thing&#8221; to which we are linking.</p>
<h4>What is a &#8220;Newman, Paul&#8221; Thing?</h4>
<p>Of course, without its own vocabulary, we are left to deduce what this thing &#8220;<span style="font-weight:bold;">Newman, Paul</span>&#8221; <span class="double_u">is</span> that is shown in the previous screen shot. Our first clue comes from the statement that it is of <span style="font-style:italic;">rdf:type</span> <a rel="nofollow" target="_blank" href="http://www.w3.org/TR/skos-reference/">SKOS</a> <span style="font-style:italic;">concept</span>. By looking to the SKOS vocabulary, we see that <a rel="nofollow" target="_blank" href="http://www.w3.org/TR/skos-reference/#concepts"><span style="font-style:italic;">concept</span></a> is a class and is defined as:</p>
<p style="margin-left:40px;font-style:italic;">A SKOS concept can be viewed as an idea or notion; a unit of thought. However, what constitutes a unit of thought is subjective, and this definition is meant to be suggestive, rather than restrictive. The notion of a SKOS concept is useful when describing the conceptual or intellectual structure of a knowledge organization system, and when referring to specific ideas or meanings established within a KOS.</p>
<p>We also see that this instance is given a <a rel="nofollow" target="_blank" href="http://xmlns.com/foaf/0.1/primaryTopic">foaf:primaryTopic</a> of <span style="font-style:italic;">Paul Newman</span>.</p>
<p>So, we can deduce so far that this instance is about the concept or idea of <span style="font-style:italic;">Paul Newman</span>. Now, looking to the attributes of this instance — that is, the defining properties provided by the NYT — we see the properties of <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/associated_article_count">nyt:associated_article_count</a>, <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/first_use">nyt:first_use</a>, <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/last_use">nyt:last_use</a> and <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/topicPage">nyt:topicPage</a>. Completing our deductions, and in the absence of its own vocabulary, we can now define this concept instance somewhat as follows:</p>
<p style="margin-left:40px;"><span style="font-style:italic;">New York Times articles in the period 2001 to 2009 having as their primary topic the actor Paul Newman</span></p>
<p>(BTW, across all records in this dataset, we could see what the earliest first use was to better deduce the time period over which these articles have been assembled, but that has not been done.)</p>
<p>We also would re-title this instance more akin to &#8220;2001-2009 NYT Articles with a Primary Topic of Paul Newman&#8221; or some such and use URIs more akin to this usage.</p>
<h4>sameAs Woes</h4>
<p>Thus, in order to make links or connections with other data, it is essential to understand what the nature is of the subject &#8220;thing&#8221; at hand. There is much confusion about actual &#8220;things&#8221; and the references to &#8220;things&#8221; and what is the nature of a &#8220;thing&#8221; within the literature and on mailing lists.</p>
<p>Our belief and usage in matters of the semantic Web is that all &#8220;things&#8221; we deal with are a reference to whatever the &#8220;true&#8221;, actual thing is. The question then becomes: What is the nature (or scope) of this referent?</p>
<p>There are actually quite easy ways to determine this nature. First, look to one or more instance examples of the &#8220;thing&#8221; being referred to. In our case above, we have the &#8220;<span style="font-weight:bold;">Newman, Paul</span>&#8221; instance record. Then, look to the properties (or attributes) the publisher of that record has used to describe that thing. Again, in the case above, we have <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/associated_article_count">nyt:associated_article_count</a>, <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/first_use">nyt:first_use</a>, <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/latest_use">nyt:last_use</a> and <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/topicPage">nyt:topicPage</a>.</p>
<p>Clearly, this instance record — that is, its nature — deals with articles or groups of articles. The relation to <span style="font-style:italic;">Paul Newman</span> occurs as a basis of the <span class="double_u">primary topic</span> of these articles, and not a <span class="double_u">person</span> basis for which to describe the instance. If the nature of the instance were indeed the person <span style="font-style:italic;">Paul Newman</span>, then the attributes of the record would more properly be related to &#8220;person&#8221; properties such as age, sex, birthdate, death date, marital status, etc.</p>
<p>This confusion by NYT as to the nature of the &#8220;things&#8221; they are describing then leads to some very serious errors. By confusing the topic (<span style="font-style:italic;">Paul Newman</span>) of a record with the nature of that record (articles about topics), NYT next misuses one of the most powerful semantic Web predicates available, <span style="font-weight:bold;">owl:sameAs</span>.</p>
<p>By asserting in the &#8220;<span style="font-weight:bold;">Newman, Paul</span>&#8221; record that the instance has a <span style="font-weight:bold;">sameAs</span> relationship with external records in <a rel="nofollow" target="_blank" href="http://rdf.freebase.com/ns/en.paul_newman">Freebase</a> and <a rel="nofollow" target="_blank" href="http://dbpedia.org/resource/Paul_Newman">DBpedia</a>, the NYT both <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Entailment">entail</a>s that properties from any of the associated records are shared and <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Inference">infers</a> a chain of other types to describe the record. More precisely, the NYT is asserting that the &#8220;things&#8221; referred to by these instances are <strong>identical</strong> resources.</p>
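<p>The effect of such an identity assertion can be sketched in a few lines. The code below is a deliberately naive illustration, not a real OWL reasoner: it treats <span style="font-weight:bold;">owl:sameAs</span> as symmetric and transitive and then collects every <span style="font-style:italic;">rdf:type</span> stated for any of the identified resources. The prefixed names are shorthand, and the tiny graph is an assumed subset of what the actual records contain.</p>

```python
SAME_AS = "owl:sameAs"
TYPE = "rdf:type"

# A small, assumed subset of the statements involved (prefixed names are shorthand).
triples = {
    ("nyt:newman_paul_per", TYPE, "skos:Concept"),
    ("nyt:newman_paul_per", SAME_AS, "dbpedia:Paul_Newman"),
    ("dbpedia:Paul_Newman", TYPE, "foaf:Person"),
    ("dbpedia:Paul_Newman", TYPE, "dbpedia-owl:Actor"),
}


def types_with_same_as(subject, graph):
    """Return all types of the subject once sameAs identities are taken into account."""
    same = {subject}
    changed = True
    # Grow the set of identical resources until no sameAs statement adds a new one.
    while changed:
        changed = False
        for s, p, o in graph:
            if p != SAME_AS:
                continue
            if s in same and o not in same:
                same.add(o)
                changed = True
            elif o in same and s not in same:
                same.add(s)
                changed = True
    # Every type stated for any identical resource now applies to the subject.
    return sorted(o for s, p, o in graph if p == TYPE and s in same)


print(types_with_same_as("nyt:newman_paul_per", triples))
```

<p>Even in this toy graph, a record that only declared itself a SKOS concept ends up typed as a person and an actor as well.</p>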
<p>Thus, by the <span style="font-weight:bold;">sameAs</span> statements in the &#8220;<span style="font-weight:bold;">Newman, Paul</span>&#8221; record, the NYT is also asserting that the record is an instance of all these things <a rel="nofollow" href="#id5">[5]</a>:</p>
<table border="0">
<tbody>
<tr>
<td></td>
<td>
<ul>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/about/html/http://www.w3.org/2002/07/owl%23Thing"> owl:Thing</a></li>
<li> <a rel="nofollow" target="_blank" href="http://xmlns.com/foaf/spec/#term_Agent">foaf:Agent</a></li>
<li> <a rel="nofollow" target="_blank" href="http://xmlns.com/foaf/spec/#term_Person">foaf:Person</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/ontology/Actor">dbpedia-owl:Actor</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/JewishActors">http://dbpedia.org/class/yago/JewishActors</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/PeopleFromCleveland,Ohio">http://dbpedia.org/class/yago/PeopleFromCleveland,Ohio</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/ontology/Artist">dbpedia-owl:Artist</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/ontology/Person">dbpedia-owl:Person</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/Person100007846">http://dbpedia.org/class/yago/Person100007846</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/AmericanFilmDirectors">http://dbpedia.org/class/yago/AmericanFilmDirectors</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/YaleUniversityAlumni">http://dbpedia.org/class/yago/YaleUniversityAlumni</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/OhioUniversityAlumni">http://dbpedia.org/class/yago/OhioUniversityAlumni</a></li>
<li> <a rel="nofollow" target="_blank" href="http://sw.opencyc.org/2008/06/10/concept/Mx4rvVjWoZwpEbGdrcN5Y29ycA"> opencyc:en/MaleHuman</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/AmericanFilmActors">http://dbpedia.org/class/yago/AmericanFilmActors</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/Liberals">http://dbpedia.org/class/yago/Liberals</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/OhioActors">http://dbpedia.org/class/yago/OhioActors</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/UnitedStatesNavySailors">http://dbpedia.org/class/yago/UnitedStatesNavySailors</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/PeopleFromWestport,Connecticut"> http://dbpedia.org/class/yago/PeopleFromWestport,Connecticut</a></li>
<li> <a rel="nofollow" target="_blank" href="http://sw.opencyc.org/2008/06/10/concept/Mx4rwQB4UJwpEbGdrcN5Y29ycA"> opencyc:en/JewishPerson</a></li>
<li> <a rel="nofollow" target="_blank" href="http://sw.opencyc.org/2008/06/10/concept/Mx4rwMRyTJwpEbGdrcN5Y29ycA"> opencyc:en/ActorInMovies</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/LivingPeople">http://dbpedia.org/class/yago/LivingPeople</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/Actor109765278">http://dbpedia.org/class/yago/Actor109765278</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/AmericanVegetarians">http://dbpedia.org/class/yago/AmericanVegetarians</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/AmericanPhilanthropists">http://dbpedia.org/class/yago/AmericanPhilanthropists</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/KenyonCollegeAlumni">http://dbpedia.org/class/yago/KenyonCollegeAlumni</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/WesternFilmActors">http://dbpedia.org/class/yago/WesternFilmActors</a></li>
<li> <a rel="nofollow" target="_blank" href="http://dbpedia.org/class/yago/ActorsStudioAlumni">http://dbpedia.org/class/yago/ActorsStudioAlumni</a></li>
<li>and, a hundred other dbpedia_yago superClasses.</li>
</ul>
</td>
</tr>
</tbody>
</table>
<p>Furthermore, because of its strong, reciprocal entailments, the <span style="font-weight:bold;">owl:sameAs</span> assertion would also now entail that the person <span style="font-style:italic;">Paul Newman</span> has the <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/first_use">nyt:first_use</a> and <a rel="nofollow" target="_blank" href="http://data.nytimes.com/elements/latest_use">nyt:latest_use</a> attributes, clearly illogical for a &#8220;person&#8221; thing.</p>
<p>This connection is clearly wrong in both directions. <span style="font-style:italic;">Articles</span> are not <span style="font-style:italic;">persons</span> and don&#8217;t have <span style="font-style:italic;">marital status</span>; and <span style="font-style:italic;">persons</span> do not have <span style="font-style:italic;">first_uses</span>. By misapplying this <span style="font-weight:bold;">sameAs</span> linkage relationship, we have screwed things up in every which way. And the error began with misunderstanding what kinds of &#8220;things&#8221; our data is about.</p>
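<p>To make the two-way entailment concrete, here is a minimal sketch in Python of how a reasoner that treats <span style="font-weight:bold;">owl:sameAs</span> as identity would merge the statements about both resources. It is an illustration under stated assumptions, not a description of any particular reasoner: the function name same_as_closure is hypothetical, the prefixed names are shorthand for the NYT and DBpedia URIs mentioned above, the literal value is a placeholder, and only subject-position statements are propagated.</p>

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"


def same_as_closure(triples):
    """Return the input triples plus those entailed by reading owl:sameAs as identity.

    Simplification (assumption of this sketch): only statements whose SUBJECT
    is in a sameAs group are copied to the other members of that group.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Merge every pair of resources linked by owl:sameAs.
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    # Collect the members of each identity group.
    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)

    # Copy each ordinary statement to every alias of its subject.
    entailed = set(triples)
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for alias in groups.get(find(s), {s}):
            entailed.add((alias, p, o))
    return entailed


# Illustrative data only: the date literal is a placeholder, not a real value.
triples = [
    ("nyt:newman_paul_per", SAME_AS, "dbpedia:Paul_Newman"),
    ("nyt:newman_paul_per", "nyt:first_use", "PLACEHOLDER_DATE"),
    ("dbpedia:Paul_Newman", "rdf:type", "foaf:Person"),
]
entailed = same_as_closure(triples)
```

<p>In this sketch the person resource ends up with a <span style="font-weight:bold;">nyt:first_use</span> value and the article-topic record ends up typed as <span style="font-weight:bold;">foaf:Person</span>, which is the nonsensical result described above.</p>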
<h4>Some Options</h4>
<p>However, there are solutions. First, the <span style="font-weight:bold;">sameAs</span> assertions, at least involving these external resources, should be dropped.</p>
<p>Second, if linkages are still desired, a vocabulary such as <a rel="nofollow" target="_blank" href="http://umbel.org/">UMBEL</a> <a rel="nofollow" href="#ld4">[4]</a> could be used to assert a relation between such a concept and these other related resources. So, even though these resources are not the same, they are <strong>closely</strong> related. The UMBEL ontology helps us to define this kind of relation between related, but non-identical, resources.</p>
<p>Instead of using the <span style="font-weight:bold;">owl:sameAs</span> property, we would suggest using <span style="font-weight:bold;">umbel:linksEntity</span>, which links a <span style="font-weight:bold;">skos:Concept</span> to related named-entity resources. Additionally, Freebase, which also currently asserts a <span style="font-weight:bold;">sameAs</span> relationship to the NYT resource, could use the <span style="font-weight:bold;">umbel:isAbout</span> relationship to assert that its resource &#8220;is about&#8221; a certain concept, which is the one defined by the NYT.</p>
<p>Alternatively, still other external vocabularies that more precisely capture the intent of the NYT publishers could be found, or the NYT editors could define their own properties specifically addressing their unique linkage interests.</p>
<h4>Other Minor Issues</h4>
<p>As a couple of additional, minor suggestions for the NYT dataset, we would suggest:</p>
<ul>
<li>Create a <span style="font-weight:bold;">foaf:Organization</span> description of the NYT organization, then use it with <span style="font-weight:bold;">dc:creator</span> and <span style="font-weight:bold;">dcterms:rightsHolder</span> rather than using a literal, and</li>
<li>The dual URIs such as &#8220;<a rel="nofollow" target="_blank" href="http://data.nytimes.com/N31738445835662083893">http://data.nytimes.com/N31738445835662083893</a>&#8221; and &#8220;<a rel="nofollow" target="_blank" href="http://data.nytimes.com/newman_paul_per">http://data.nytimes.com/newman_paul_per</a>&#8221; are not wrong in themselves, but the purpose is hard to understand. Why does a single organization need to create multiple resources for the <strong>identical resource,</strong> when it comes from the same system and has the same purpose?</li>
</ul>
<h4>Re-visiting the Linkage &#8220;Rule&#8221;</h4>
<p>There are very valuable benefits from entailment, inference and logic to be gained from linking resources. However, if the nature of the &#8220;things&#8221; being linked — or the properties that define these linkages — are incorrect, then very wrong logical implications result. Great care and understanding should be applied to linkage assertions.</p>
<h3>In the End, the Challenge is Not Linked Data, but <span style="font-style:italic;text-decoration:underline;">Connected</span> Data</h3>
<p>Our critical comments are not meant to be disrespectful or picky. The NYT and TWC are prominent institutions from which we should expect leadership on these issues. Our criticisms (and we believe those of others) are also not an expression of a &#8220;<a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Hype_cycle">trough of disillusionment</a>&#8221; as <a rel="nofollow" target="_blank" href="http://twitter.com/gregboutin/status/5558525462">some</a> have suggested.</p>
<div class="boxYellowDotted" style="margin:0pt 0pt 0pt 10px;float:right;width:300px;text-align:center;">This posting has been jointly authored by <a rel="nofollow" target="_blank" href="http://mkbergman.com/"> Mike Bergman</a> and <a rel="nofollow" target="_blank" href="http://fgiasson.com/blog">Fred Giasson</a> and simultaneously published on both of their blogs, hoping to draw more attention to the need for better practices in publishing linked data.</div>
<p>This posting is about poor practices, pure and simple. The time to correct them is now. If asked, we would be pleased to help either institution establish exemplar practices. This is not automatic, and it is not always easy. The data.gov datasets, in particular, will require much time and effort to get right. There is much documentation that needs to be transitioned and expressed in semantic Web formats.</p>
<p>In a broader sense, we also seem to lack a definition of best practices related to <span style="font-weight:bold;">vocabularies</span>, <span style="font-weight:bold;">schema</span> and <span style="font-weight:bold;">mappings</span>. The Berners-Lee rules are imprecise and insufficient as is. Prior best-practice guidance documents tend to address how to publish data and make URIs linkable more than how to properly characterize, describe and connect the data.</p>
<p>Perhaps, in part, this is a bit of a semantics issue. The challenge is not the mechanics of <span style="font-style:italic;">linking data</span>, but the meaning and basis for <span class="double_u">connecting</span> that data. Connections require logic and rationality sufficient to reliably inform inference and rule-based engines. They also need to pass the sniff test as we &#8220;follow our nose&#8221; by clicking the links exposed by the data.</p>
<p>It is exciting to see high-quality content such as from national governments and major publishers like the New York Times begin to be exposed as linked data. When this content finally gets embedded into usable contexts, we should see manifest uses and benefits emerge. We hope both institutions take our criticisms in that spirit.</p>
<hr style="margin:15px 0px;" size="1"/>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="ld1" name="ld1"></a> [1] The NYT data has been updated with improvements that fixed multiple issues from the first release. The problems listed herein, however, still pertain after these improvements.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="ld2" name="ld2"></a> [2] Tim Berners-Lee, 2006. Linked Data (Design Issues), first posted on 2006-07-27; last updated on 2009-06-18. See <a rel="nofollow" target="_blank" href="http://www.w3.org/DesignIssues/LinkedData.html">http://www.w3.org/DesignIssues/LinkedData.html</a>. Berners-Lee refers to the steps above as &#8220;rules,&#8221; but he elaborates they are expectations of behavior. Most later citations refer to these as &#8220;principles.&#8221;</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="ld3" name="ld3"></a> [3] Li Ding, Dominic DiFranzo, Sarah Magidson, Deborah L. McGuinness and Jim Hendler, 2009. Data-GovWiki: Towards Linked Government Data. See <a rel="nofollow" target="_blank" href="http://www.cs.vu.nl/%7Epmika/swc/documents/Data-gov%20Wiki-data-gov-wiki-v1.pdf"> http://www.cs.vu.nl/~pmika/swc/documents/Data-gov%20Wiki-data-gov-wiki-v1.pdf</a>.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="ld4" name="ld4"></a> [4] UMBEL <em>(Upper Mapping and Binding Exchange Layer)</em> is a lightweight ontology structure in development for relating Web content and data to a standard set of subject concepts. Its purpose has resulted in the creation of an associated vocabulary geared to both class-instance and reciprocal relationships, as well as partial or likelihood relationships. See <a rel="nofollow" target="_blank" href="http://umbel.org/technical_documentation.html#vocabulary">http://umbel.org/technical_documentation.html#vocabulary</a>.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" name="id5"></a>[5] We&#8217;d like to thank Denny Vrandecic (see comments) for pointing out an imprecision in our original wording. This phrase was originally stated as, &#8220;Thus, by the sameAs statements in the &#8216;Newman, Paul&#8217; record, the NYT is also asserting that that record is the same as these other things.&#8221;</div>
]]></content:encoded>
      </item>
      <item>
         <title>Wikipedia infobox template coherence</title>
         <link>http://ebiquity.umbc.edu/blogger/2009/11/15/wikipedia-infobox-template-coherence/</link>
         <description>Wikipedia has an interesting RFC on approaches to achieve and maintain better coherence in its infobox templates. This is significant because Wikipedia is becoming the new CYC &amp;#8212; a broad, practical KB filled with general purpose background knowledge. The RFC was kicked off by discussions on dbpedia template annotations. The RFC defines [...]</description>
         <guid isPermaLink="false">http://ebiquity.umbc.edu/blogger/?p=2695</guid>
         <pubDate>Sun, 15 Nov 2009 09:29:02 -0800</pubDate>
         <content:encoded><![CDATA[<p>Wikipedia has an interesting <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/infobox_template_coherence">RFC</a> on approaches to achieve and maintain better coherence in its infobox templates. This is significant because Wikipedia is becoming the new CYC &#8212; a broad, practical KB filled with general purpose background knowledge. The RFC was kicked off by discussions on <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#DBpedia_Template_Annotations">dbpedia template annotations</a>. The RFC defines the problem as:</p>
<blockquote><p> &#8220;Wikipedia uses hundreds of infobox templates for describing various entity types like NFL teams, schools in Canada, train stations etc. These infoboxes are separated and do not use a common vocabulary. Several different spellings of attributes are used for them, which all stand for the same meaning (e.g. birth_place, birthPlace, origin). This poses limitations to checking consistency within Wikipedia infoboxes, amongst different language editions, and it makes it hard for external tools to reuse the information in infoboxes.&#8221;</p></blockquote>
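<p>The attribute-spelling problem quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the mechanism the RFC proposes: the synonym table, the function names and the sample infobox values are all hypothetical, and the choice of birthPlace as the shared name is arbitrary.</p>

```python
# Hypothetical synonym table: each known spelling maps to one shared name.
# The three spellings come from the RFC text; picking "birthPlace" as the
# canonical form is an assumption made for this sketch.
CANONICAL = {
    "birth_place": "birthPlace",
    "birthPlace": "birthPlace",
    "origin": "birthPlace",
}


def normalize_infobox(infobox):
    """Rename known synonymous attributes; unknown attributes are kept as-is.

    If two synonyms appear in the same infobox, the later one wins.
    """
    normalized = {}
    for key, value in infobox.items():
        normalized[CANONICAL.get(key, key)] = value
    return normalized


def conflicting_attributes(first, second):
    """Return the shared (normalized) attribute names whose values differ."""
    first = normalize_infobox(first)
    second = normalize_infobox(second)
    shared = set(first).intersection(second)
    return sorted(name for name in shared if first[name] != second[name])


# Placeholder infoboxes standing in for two language editions.
edition_a = {"name": "Example Person", "birth_place": "City A"}
edition_b = {"name": "Example Person", "origin": "City B"}
conflicts = conflicting_attributes(edition_a, edition_b)
```

<p>Without the shared vocabulary the two editions have no attribute in common except the name, so the disagreement over the birth place would go unnoticed; after normalization it shows up as a conflict on birthPlace.</p>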
<p>The goals mentioned in the RFC include (1) establishing the currently missing links between synonymous template attributes, (2) enabling authors to use template annotations to check for factual inconsistencies (e.g., outdated population figures), and (3) providing consensus about which properties should be used in templates and what data they should contain.</p>]]></content:encoded>
      </item>
      <item>
         <title>If there were no flowers, then we would die</title>
         <link>http://www.thewebsemantic.com/2009/11/14/if-there-were-no-flowers-then-we-would-die/</link>
         <description>My daughter made this for me today. Notice the invitation to user generated content&amp;#8230;free-for-all flowers!</description>
         <guid isPermaLink="false">http://www.thewebsemantic.com/?p=180</guid>
         <pubDate>Sat, 14 Nov 2009 14:16:28 -0800</pubDate>
      </item>
      <item>
         <title>Exposing Government Data at the ISWC</title>
         <link>http://dallemang.typepad.com/my_weblog/2009/11/exposing-government-data-at-the-iswc.html</link>
         <description>One of the goals of the organizers of the International Semantic Web Conference this year was to expose government information managers and contractors to the Semantic Web. To some extent this succeeded; there were a number of attendees who came...</description>
         <guid isPermaLink="false">JsAx1rN13BGzogf56UjTQA_211ee6df5c53194f7768b3a189ccced0</guid>
         <pubDate>Fri, 13 Nov 2009 07:06:41 -0800</pubDate>
         <content:encoded><![CDATA[One of the goals of the organizers of the <a rel="nofollow">International Semantic Web Conference</a> this year was to expose government information managers and contractors to the Semantic Web. To some extent this succeeded; there were a number of attendees who came to the conference from government agencies. On the other hand, it failed; many of them were already Semantic Web enthusiasts, so there was some element of preaching to the choir. <p>
I felt that my own major session, a tutorial on building semantic web applications for government data, was very successful in this regard. I was originally disappointed with the registration, but discovered on the day that most people signed up for 'tutorials' in general, then attended whatever they liked. The room made it to SRO before the first coffee break, and many of the people there were from government agencies or contractors, as desired (many were from other places, but I'm not going to complain about that!).
</p><p>
Probably the biggest measure of success for this goal was the exposure in the <a rel="nofollow" target="_blank" href="http://gcn.com/Home.aspx">Government Computer News</a>. Senior Technology Editor Joab Jackson seemed to like my elevator pitch about the Semantic Web (though it only works for veerrryyy slloooww elevators) enough to repeat it in <a rel="nofollow" target="_blank" href="http://gcn.com/Articles/2009/11/09/Linked-Government-Data-feature.aspx?Page=4">one of his articles about the event</a>. In <a rel="nofollow" target="_blank" href="http://gcn.com/Articles/2009/11/09/Linked-Government-Data-sidebar-Gvmt-Examples.aspx?Page=1">another article</a>, Joab told me something I didn't know - that the report we generated in the tutorial is actually interesting to government IT managers, and would be somewhat labor intensive without linked open government data.
</p><p>
GCN's <a rel="nofollow" target="_blank" href="http://gcn.com/articles/2009/11/09/linked-government-data-sidebar-01-resources.aspx?sc_lang=en">13 resources</a> seems like an intentional flouting of superstition, since one could easily come up with many more. I am flattered that <a rel="nofollow" target="_blank" href="http://topquadrant.com/resources/SemWebGovtExercise.html">one of my own pages</a> made it to the list; many of the omissions are available there, and include <a rel="nofollow" target="_blank" href="http://usgovxml.com/">US Gov XML</a> and <a rel="nofollow" target="_blank" href="http://oegov.org/">OEgov</a>. </p><p>All in all (thanks to a great extent to Mr. Jackson's efforts), I think we managed to achieve some exposure for semantic web technology for government information managers. </p>]]></content:encoded>
      </item>
      <item>
         <title>SuRF 1.0.0 Beta</title>
         <link>http://blog.deri.ie/index.php?id=452&amp;no_cache=1&amp;tx_ttnews[tt_news]=592</link>
         <description>&lt;h4&gt;SuRF 1.0.0 Beta &lt;/h4&gt;
We are pleased to announce the release of SuRF 1.0.0 Beta. This version includes some significant changes and improvements to the interface, hence the major version number shift.
SuRF is an Object-RDF Mapper based on the popular rdflib Python library. It exposes RDF triple sets as sets of resources and integrates them into the object-oriented paradigm of Python, in a manner similar to what ActiveRDF does for Ruby.
New features in 1.0.0 Beta version: &lt;ul&gt;&lt;li&gt;Improved resource querying. Can mix any of these features together:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;filter resources by attribute values&lt;/li&gt;&lt;li&gt;filter resources using SPARQL filter expressions&lt;/li&gt;&lt;li&gt;limit, offset, order ascending/descending&lt;/li&gt;&lt;li&gt;specify graph/context where resources should be loaded from and later saved to&lt;/li&gt;&lt;li&gt;eager-load resource attributes&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Improved attribute querying. All the querying features available at resource level are also available at attribute level.&lt;/li&gt;&lt;li&gt;Growing amount of documentation and examples. Still big gaps there but the situation is improving.&lt;/li&gt;&lt;/ul&gt;
Project Google Code site: http://code.google.com/p/surfrdf/ &lt;br /&gt; Documentation: http://packages.python.org/SuRF/
You are very welcome to try it out, tell us about your experiences, report bugs and participate!
contact: Peteris Caune</description>
         <guid isPermaLink="false">http://blog.deri.ie/index.php?id=452&amp;no_cache=1&amp;tx_ttnews[tt_news]=592</guid>
         <pubDate>Fri, 13 Nov 2009 09:28:08 -0800</pubDate>
         <category>DERI Outreach</category>
      </item>
      <item>
         <title>This Week at DERI</title>
         <link>http://blog.deri.ie/index.php?id=452&amp;no_cache=1&amp;tx_ttnews[tt_news]=593</link>
         <description>&lt;h4&gt;Open Your mind : Lunchtimes Seminar Series&lt;/h4&gt; Initiative organised collectively by the Outreach officers of REMEDI, NCBES, Applied Optics, ECI &amp;amp; DERI. &lt;h4&gt; Release of SuRF version 1.0.0 Beta &lt;/h4&gt; Read the announcement in the previous post 
&lt;h4&gt; TPAC 2009&lt;/h4&gt;
The World Wide Web Consortium TPAC 2009 took place at Santa Clara, California, USA from 2 November - 6 November.
&quot;The Combined Technical Plenary / Advisory Committee Meetings Week brings together W3C Working and Interest Groups, the Advisory Board, the TAG and the Advisory Committee for an exciting week of coordinated work. The highlight of the week is the Plenary Day, Wednesday, 4 November, for all to attend.&quot;
 Visit http://www.w3.org/2009/11/TPAC/PlenaryAgenda  and find slides, pdf, demos of the following sessions:
&lt;ul&gt;&lt;li&gt; Decentralized Extensibility in HTML5&lt;/li&gt;&lt;li&gt; Maintaining a Healthy Internet Ecosystem -- Challenges to an Open Internet Infrastructure&lt;/li&gt;&lt;li&gt; Lightning Talks (I)&lt;/li&gt;&lt;ul&gt;&lt;li&gt;DCCI, by Rotan Hanrahan (MobileAware)&lt;/li&gt;&lt;li&gt;Rich Web Application XG Report, by Steven Pemberton (W3C)&lt;/li&gt;&lt;li&gt;Opera Unite - a Web server for your whole family, by Charles McCathieNevile (Opera Software)&lt;/li&gt;&lt;li&gt;W3C cheatsheet for developers, by Dominique Hazaël-Massieux (W3C)&lt;/li&gt;&lt;li&gt;Semantic Web in the Oil &amp;amp; Gas Industry, by Roger Cutler (Chevron)&lt;/li&gt;&lt;li&gt;United we(b and net) stand!, by Arnaud de Moissac (SFR)&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Privacy on the Web of Applications -- Challenges and Opportunities&lt;/li&gt;&lt;li&gt;Web Apps vs App. Stores&lt;/li&gt;&lt;li&gt;Future of the Social Web&lt;/li&gt;&lt;li&gt; Lightning Talks (II)&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Multimodality in Enterprise Applications, by Raj Tumuluri (Openstream) and Tom Underhill (Microsoft)&lt;/li&gt;&lt;li&gt;If MacGyver was a spec editor: simple tools, unbelievable results, by Marcos Caceres (Opera Software)&lt;/li&gt;&lt;li&gt;ReSpec.js - A Fresh Specification-Writing Tool, by Robin Berjon (Robineko)&lt;/li&gt;&lt;li&gt;Privacy and Data Governance, by Rigo Wenning (W3C) &lt;/li&gt;&lt;li&gt;XML Test Assertions on Steroids - with TAMElizer, by Jacques Durand (Fujitsu)&lt;/li&gt;&lt;li&gt;The End of the Beginning, by Daniel Glazman (Disruptive Innovations).&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;</description>
         <guid isPermaLink="false">http://blog.deri.ie/index.php?id=452&amp;no_cache=1&amp;tx_ttnews[tt_news]=593</guid>
         <pubDate>Fri, 13 Nov 2009 09:23:46 -0800</pubDate>
         <category>The Venice Project</category>
      </item>
      <item>
         <title>Charlene Li – The Impact of Social Media in your organisation</title>
         <link>http://feedproxy.google.com/~r/Nodalities/~3/7tqzQ-xz3hw/charlene-li-the-impact-of-social-media-in-your-organisation.php</link>
         <description>The Opening Keynote slot on day three of Online Information 2009 will be filled by Charlene Li, formerly of Forrester Research and now founder of Altimeter Group.&amp;#160; Named &amp;#34;One of the Most Influential Women in Technology&amp;#34; by Fast Company magazine, Charlene, co-author of Groundswell, is well known for her opinions on how social [...]</description>
         <guid isPermaLink="false">JsAx1rN13BGzogf56UjTQA_57f0e10aa0152a8a3e07d3bada7bf135</guid>
         <pubDate>Thu, 12 Nov 2009 20:48:41 -0800</pubDate>
         <content:encoded/>
      </item>
      <item>
         <title>The Pedantic Web Group to the RDF Rescue!</title>
         <link>http://www.semanticweb.com/features/the_pedantic_web_group_to_the_rdf_rescue_143103.asp?c=rss</link>
         <description>&lt;p&gt;&lt;strong&gt;&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.mediabistro.com/Jennifer-Zaino-profile.html&quot;&gt;Jennifer Zaino&lt;/a&gt;&lt;/strong&gt; &lt;br /&gt;
&lt;em&gt;SemanticWeb.com Contributor&lt;/em&gt;&lt;/p&gt; &lt;p&gt;How’s your RDF? If it could be in better shape, some folks may be able to help: The &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://pedantic-web.org/&quot;&gt;Pedantic Web Group&lt;/a&gt; was recently formed by researchers at &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semanticweb.com/news/irish_eyes_are_smiling_on_semantic_web_research_138984.asp&quot;&gt;Digital Enterprise Research Institute&lt;/a&gt; (DERI) and &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.aifb.uni-karlsruhe.de/&quot;&gt;Institute AIFB&lt;/a&gt; at the Universitaet Karlsruhe (see &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semanticweb.com/main/researchers_focus_on_semantic_web_incentives_138924.asp&quot;&gt;previous article&lt;/a&gt;). &lt;/p&gt; &lt;p&gt;&lt;img alt=&quot;Aidan_Hogan.jpg&quot; src=&quot;http://www.semanticweb.com/original/Aidan_Hogan.jpg&quot; width=&quot;94&quot; height=&quot;124&quot; align=&quot;right&quot; vspace=&quot;6&quot; hspace=&quot;3&quot;/&gt;To find out more about the new initiative, &lt;em&gt;SemanticWeb.com&lt;/em&gt; recently conducted an email conversation with one of the group’s founders, Aidan Hogan, who is working on the URQ research stream at DERI, a work program to find the right trade-off between expressive knowledge representation and efficient, scalable reasoning and querying techniques in the open, distributed environment of the Semantic Web.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;SemanticWeb.com:&lt;/strong&gt; Have you and your colleagues observed a growing trend of RDF data being published? 
What might that lead you to conclude about the growing maturity of the semantic web?&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Hogan:&lt;/strong&gt; There is certainly an encouraging trend of growth in RDF data -- in terms of quality, heterogeneity and quantity -- being published to the Web. Back in 2005 when I started working in the area of Semantic Web research, RDF Web data consisted of a number of interlinked FOAF profiles and some data published under the auspices of various research projects or geek curiosity. The quality of the data was, as I remember, quite poor. Publishers were reluctant to use URIs to name their resources, vocabularies were replete with errors, and interlinking between datasets was either poor or nonexistent. My own FOAF file, at that time, was no different; they were certainly more innocent times.&lt;/p&gt; &lt;p&gt;Jump to late 2009 and we've come a long way. More specifically, under the pragmatic guidance of the Linked Data movement, RDF data published on the Web has come a long way. The Linked Data movement has been integral to the maturation of RDF Web publishing, not merely by promoting a set of pragmatic best practices, but also by refocusing efforts on producing data: before, data was often published in RDF as an afterthought, or for the purposes of a specific application. Now, Linked Data advocates publishing RDF data on the Web as a worthwhile endeavor in itself. As such, we are now on our way to solving the chicken-and-egg problem with the Semantic Web with respect to which comes first: the data or the applications. &lt;/p&gt; &lt;p&gt;Now is an exciting time for R&amp;D into applications which can exploit the fruits of Linked Data. As compared to four or five years ago, data quality has improved as, for example, publishers understand the importance of using URIs to name things, and that those URIs should be dereferenceable. 
Quantity and heterogeneity have also increased, as the March '09 LOD cloud [see http://linkeddata.org] attests; data is being published by governmental and commercial entities and is becoming more 'general-interest'. And, the trend is continuing; for example, there were two exciting announcements at ISWC last week: that &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semanticweb.com/main/phase2_wants_to_push_some_semantic_buttons_138935.asp&quot;&gt;Drupal 7 core&lt;/a&gt; will support &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semanticweb.com/insight/siocing_the_semantic_web_138836.asp&quot;&gt;SIOC/RDFa&lt;/a&gt; exports by default and that the &lt;em&gt;New York Times&lt;/em&gt; is planning to produce Linked Data exports. &lt;/p&gt; &lt;p class=&quot;continued&quot;&gt;&lt;a rel=&quot;nofollow&quot; class=&quot;continued&quot; target=&quot;_blank&quot; href=&quot;http://www.semanticweb.com/features/the_pedantic_web_group_to_the_rdf_rescue_143103.asp#more&quot;&gt;continued...&lt;/a&gt;&lt;/p&gt; &lt;p&gt;New Career Opportunities Daily: The &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.mediabistro.com/joblistings/?c=rss&quot;&gt;best jobs in media&lt;/a&gt;. &lt;/p&gt;</description>
         <guid isPermaLink="false">http://www.semanticweb.com/features/the_pedantic_web_group_to_the_rdf_rescue_143103.asp?c=rss</guid>
         <pubDate>Fri, 13 Nov 2009 03:59:24 -0800</pubDate>
         <category>Features</category>
         <enclosure length="4805" url="http://www.semanticweb.com/original/Aidan_Hogan.jpg" type="image/jpeg"/>
      </item>
      <item>
         <title>Marissa Mayer's Take On Search - Mediapost.com</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/zPVRVr3looY/industry-news-marissa-mayers-take-search-mediapostcom.html</link>
         <description>&lt;table border=&quot;0&quot; cellpadding=&quot;2&quot; cellspacing=&quot;7&quot;&gt;&lt;tr&gt;&lt;td width=&quot;80&quot; align=&quot;center&quot; valign=&quot;top&quot;&gt;&lt;font&gt;&lt;img src=&quot;http://nt3.ggpht.com/news/tbn/G6QtWf4CyHAADM/6.jpg&quot; alt=&quot;&quot; border=&quot;1&quot; width=&quot;80&quot; height=&quot;80&quot;/&gt;&lt;/font&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;</description>
         <guid isPermaLink="false">4691 at http://www.semanticuniverse.com</guid>
         <pubDate>Thu, 12 Nov 2009 11:12:48 -0800</pubDate>
      </item>
      <item>
         <title>Pay to be free…</title>
         <link>http://ivan-herman.name/2009/11/12/pay-to-be-free%e2%80%a6/</link>
         <description>I may not be well informed, so this may be a known approach for some of you, but it is the first time I have seen this…
There has been a tension between (scientific) publishers and authors for a while on whether one is allowed to put one&amp;#8217;s publication on the Web. When dealing with traditional publishers [...]</description>
         <guid isPermaLink="false">http://ivan-herman.name/?p=466</guid>
         <pubDate>Thu, 12 Nov 2009 08:00:52 -0800</pubDate>
         <content:encoded><![CDATA[<div class='snap_preview'><br /><p>I may not be well informed, so this may be a known approach for some of you, but it is the first time I have seen this…</p>
<p>There has been a tension between (scientific) publishers and authors for a while on whether one is allowed to put one&#8217;s publication on the Web. When dealing with traditional publishers the author usually gives away his/her copyright and the papers are rarely available on the Web (which is a source of constant frustration to readers). Fortunately, this is not always the case; for example, the proceedings of the World Wide Web conference series are published by ACM, but the papers are nevertheless available on the Web for free (thanks to <a rel="nofollow" target="_blank" href="http://www.iw3c2.org">IW3C2</a>).</p>
<p>Well, a counter-proposal from a publisher is quite amazing. A Hungarian publisher, <a rel="nofollow" target="_blank" href="http://www.akkrt.hu">Akadémiai Kiadó</a>, offers authors a deal, called the <a rel="nofollow" target="_blank" href="http://www.akkrt.hu/main.php?folderID=1720&amp;articleID=4583&amp;ctag=articlelist&amp;iid=1&amp;accessible=">“Optional Open Article”</a>: if you pay the nice sum of 900€, then the paper is also put into an online edition and is made freely available on the Web. (The fact that it is then freely available is clear in the <a rel="nofollow" target="_blank" href="http://www.akkrt.hu/pdf/OOpenArt_License_Agreement.pdf">agreement posted on the web site</a>.) Pay for your freedom. Isn’t this wonderful?</p>
<p>And, to make it clear: this <em>is</em> a very prestigious publisher in Hungary; it is affiliated with the Hungarian Academy of Sciences and is, therefore, the prime local publisher for Hungarian scientists…</p>
<p>I find it appalling. But this may only be me.</p>
Posted in Social aspects, Work Related. Tagged: publishing</div>]]></content:encoded>
         <media:content url="http://0.gravatar.com/avatar/ee636fa218fc08a28db5288c2149e309?s=96&amp;d=identicon" medium="image">
            <media:title>ivanherman</media:title>
         </media:content>
      </item>
      <item>
         <title>Early Detection Cancer Research Collaboration: Canary Foundation, GenoLogics ... - Eworldwire (press release)</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/SnrJiju5HNY/industry-news-early-detection-cancer-research-collaboration-canary-foundation-genologics-eworldwire-</link>
         <description></description>
         <guid isPermaLink="false">4684 at http://www.semanticuniverse.com</guid>
         <pubDate>Thu, 12 Nov 2009 07:03:51 -0800</pubDate>
      </item>
      <item>
         <title>Bing Hopes to Get Search Bang From Wolfram Alpha</title>
         <link>http://www.semanticweb.com/news/bing_hopes_to_get_search_bang_from_wolfram_alpha_143042.asp?c=rss</link>
         <description>&lt;p&gt;Microsoft's Bing search engine has generated a lot of buzz since its debut in June, grabbing almost 10 percent of the search market. &lt;/p&gt; &lt;p&gt;Now Bing hopes to get to the next level by integrating &quot;knowledge engine&quot; Wolfram Alpha into its search results. Microsoft announced a partnership with Wolfram Alpha that will include health, nutrition and math data in Bing search results.&lt;/p&gt; &lt;p&gt;(The &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://blog.wolframalpha.com/2009/11/11/microsoft%E2%80%99s-bing-introducing-one-of-wolframalpha%E2%80%99s-first-commercial-api-customers/&quot;&gt;Wolfram Alpha blog&lt;/a&gt; discusses the partnership, as does the &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.bing.com/community/blogs/search/archive/2009/11/11/how-many-calories-in-a-burger-what-s-2-2-2-2-2-bing-and-wolfram-alpha-have-the-answers.aspx&quot;&gt;Bing blog&lt;/a&gt;.)&lt;/p&gt; &lt;p&gt;The &lt;em&gt;New York Times&lt;/em&gt;'s Bits blog explains how it works:&lt;/p&gt; &lt;blockquote&gt;When users type a food item like &quot;chicken breast&quot; into Bing, the results will include a box showing the nutritional information for it. Bing users will also be able to have access to a body-mass index calculator or to plot certain formulas on a graph.&lt;/blockquote&gt; &lt;p&gt;Here's a screenshot of what a Bing search for BMI calculator would turn up:&lt;/p&gt; &lt;p&gt;&lt;img alt=&quot;2018.bmi.png&quot; src=&quot;http://www.semanticweb.com/original/2018.bmi.png&quot; width=&quot;576&quot; height=&quot;240&quot;/&gt;&lt;br /&gt;
&lt;/p&gt;</description>
         <guid isPermaLink="false">http://www.semanticweb.com/news/bing_hopes_to_get_search_bang_from_wolfram_alpha_143042.asp?c=rss</guid>
         <pubDate>Wed, 11 Nov 2009 20:34:36 -0800</pubDate>
         <category>News</category>
         <enclosure length="112330" url="http://www.semanticweb.com/original/2018.bmi.png" type="image/png"/>
      </item>
      <item>
         <title>A Most un-commON Way to Author Datasets</title>
         <link>http://feedproxy.google.com/~r/AI3_AdaptiveInformation/~3/47ERL9M6Je8/</link>
         <description>&lt;span class=&quot;Z3988&quot; title=&quot;ctx_ver=Z39.88-2004&amp;amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;amp;rft.title=A Most un-commON Way to Author Datasets&amp;amp;rft.aulast=Bergman&amp;amp;rft.aufirst=Mike&amp;amp;rft.subject=Adaptive Information&amp;amp;rft.subject=Ontologies&amp;amp;rft.subject=Semantic Web&amp;amp;rft.subject=Semantic Web Tools&amp;amp;rft.subject=Structured Dynamics&amp;amp;rft.subject=Structured Web&amp;amp;rft.subject=irON&amp;amp;rft.source=AI3:::Adaptive Information&amp;amp;rft.date=2009-11-11&amp;amp;rft.type=blogPost&amp;amp;rft.format=text&amp;amp;rft.identifier=http://www.mkbergman.com/845/a-most-un-common-way-to-author-datasets/&amp;amp;rft.language=English&quot;&gt;&lt;/span&gt;A Case Study of Turning Spreadsheets into Structured Data Powerhouses
In a former life, I had the nickname of &amp;#8216;Spreadsheet King&amp;#8217; (perhaps among others that I did not care to hear). I had gotten the nick because of my aggressive [...]</description>
         <guid isPermaLink="false">http://www.mkbergman.com/?p=845</guid>
         <pubDate>Wed, 11 Nov 2009 18:19:54 -0800</pubDate>
         <content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=A Most un-commON Way to Author Datasets&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Adaptive Information&amp;rft.subject=Ontologies&amp;rft.subject=Semantic Web&amp;rft.subject=Semantic Web Tools&amp;rft.subject=Structured Dynamics&amp;rft.subject=Structured Web&amp;rft.subject=irON&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2009-11-11&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/845/a-most-un-common-way-to-author-datasets/&amp;rft.language=English"></span>
<p><a rel="nofollow" target="_blank" href="http://openstructs.org/iron"><img style="border:0px solid;width:235px;height:125px;float:left;margin-right:10px;" title="irON - instance record and Object Notation" alt="irON - instance record and Object Notation" hspace="5" vspace="5" align="left"/></a></p>
<h2>A Case Study of Turning Spreadsheets into Structured Data Powerhouses</h2>
<p>In a former life, I had the nickname of &#8216;Spreadsheet King&#8217; (perhaps among others that I did not care to hear). I earned the nickname through my aggressive use of spreadsheets for financial models, competitor tracking, time series analyses, and the like. However, in all honesty, I have encountered many others in my career who are much more knowledgeable and capable with spreadsheets than I&#8217;ll ever be. So, maybe I was really more like a minor duke or a court jester than true nobility.</p>
<p>Yet, pro or amateur, there are perhaps 1 billion spreadsheet users worldwide <a rel="nofollow" href="#commON1">[1]</a>, making spreadsheets undoubtedly the most prevalent data authoring environment in existence. And, despite moans and wails about how spreadsheets can lead to chaos, spaghetti code, or violations of internal standards, they are here to stay.</p>
<p>Spreadsheets often begin as simple notetaking environments. With the addition of new findings and more analysis, some of these worksheets may evolve to become full-blown datasets. Alternatively, some spreadsheets start from Day One as intended datasets or modeling environments. Whatever the case, clearly there is much accumulated information and data value &#8220;locked up&#8221; in existing spreadsheets.</p>
<p>How to &#8220;unlock&#8221; this value for sharing and collaboration was a major stimulus to development of the <span style="font-weight:bold;">commON</span> serialization of <span style="font-weight:bold;">irON</span> (<span style="font-style:italic;">instance record</span> and <span style="font-style:italic;">Object Notation</span>) <a rel="nofollow" href="#commON2">[2]</a>. I recently published a <a rel="nofollow" target="_blank" href="http://openstructs.org/iron/common-swt-annex">case study</a> <a rel="nofollow" href="#commON3">[3]</a> that describes the reasons for, and benefits of, dataset authoring in a spreadsheet, and provides working examples and code based on <span style="font-style:italic;">Sweet Tools</span> <a rel="nofollow" href="#commON4">[4]</a> to aid users in understanding and using the <span style="font-weight:bold;">commON</span> notation. I summarize portions of that study herein.</p>
<div class="boxGreenDotted" style="margin:5px 0pt 5px 10px;width:240px;float:right;text-align:center;">This is the second article of a two-part series related to the recent <span style="font-style:italic;">Sweet Tools</span> <a rel="nofollow">update</a>.</div>
<h3>Background on <span style="font-style:italic;">Sweet Tools</span> and irON</h3>
<p>The dataset that is the focus of this <a rel="nofollow" target="_blank" href="http://openstructs.org/iron/common-swt-annex">use case</a>, <a rel="nofollow"><span style="font-style:italic;">Sweet Tools</span></a>, began as an informal tracking spreadsheet about four years ago. I started it as a way to learn about available tools in the semantic Web and related spaces. Once I began publishing it, others found it of value, so I continued to develop it.</p>
<p>As it grew over time, however, it gained in structure and size. Eventually, it became a reference dataset that many other people wanted to use and interact with. The current version has well over 800 tools listed, characterized by many structured data attributes such as type, programming language, description and so forth. As it has grown, a formal controlled vocabulary has also evolved to bring consistency to the characterization of many of these attributes.</p>
<p>It was natural for me to maintain this listing as a spreadsheet, which was also reinforced when I was one of the first to adopt an <a rel="nofollow">Exhibit presentation</a> of the data based on a Google spreadsheet about three years back. Here is a partial view of this spreadsheet as I maintain it locally:</p>
<div style="margin:10px;text-align:center;"><a rel="nofollow" target="_blank" href="http://openstructs.org/sites/openstructs.org/files/images/swt_main_screen.png"> <img class="center_ok" style="border:0px solid;width:740px;height:356px;" title="Click to expand" src="http://openstructs.org/sites/openstructs.org/files/images/swt_main_screen.png" alt="Sweet Tools Main Spreadsheet Screen" width="1279" height="615"/></a><br />
<span style="font-style:italic;font-size:90%;">(click to expand)</span></div>
<p>When we began to develop <span style="font-weight:bold;">irON</span> in earnest as a simple (&#8220;naïve&#8221;) dataset authoring framework, it was clear that a comma-separated value, or <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Comma-separated_values">CSV</a> <a rel="nofollow" href="#commON5">[5]</a>, option should join the other two serializations under consideration, XML and JSON. CSV, though less expressive and capable as a data format than the other serializations, still has an <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Attribute-value_pair">attribute-value pair</a> (also known as key-value pairs and many other variants <a rel="nofollow" href="#commON6">[6]</a>) orientation. And, via spreadsheets, datasets can be easily authored and inspected, while also providing a rich functional environment including sorting, formatting, data validation, calculations, macros, etc.</p>
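<p>To make the attribute-value orientation of CSV concrete, here is a minimal sketch in Python. It is not taken from the case study and does not use the <span style="font-weight:bold;">commON</span> notation itself; it assumes a plain CSV with a header row, and the column names and records are hypothetical. Each row becomes a set of attribute-value pairs, with the column header as the attribute:</p>

```python
import csv
import io

# Hypothetical sample data: the column names and tool names are illustrative
# only and are not the actual Sweet Tools schema.
SAMPLE_CSV = """name,type,language
ToolA,parser,Java
ToolB,editor,Python
"""


def rows_as_attribute_value_pairs(csv_text):
    """Return each CSV row as a dict mapping attribute (header) to value (cell)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


records = rows_as_attribute_value_pairs(SAMPLE_CSV)
for record in records:
    print(record)
```

<p>The header row supplies the attribute names once, so every following row only has to carry values. That is roughly what makes a spreadsheet export workable as an instance-record format.</p>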
<p>As a dataset very familiar to us as <span style="font-weight:bold;">irON</span>&#8217;s editors, and directly relevant to the semantic Web, <span style="font-style:italic;">Sweet Tools</span> provided a perfect prototype case study for helping to guide the development of <span style="font-weight:bold;">irON</span>, and specifically what came to be known as the <span style="font-weight:bold;">commON</span> serialization for <span style="font-weight:bold;">irON</span>. The <span style="font-style:italic;">Sweet Tools</span> dataset is relatively large for a specialty source, has many different types and attributes, and is characterized by text, images, URLs and similar content.</p>
<p>The premise was that if <span style="font-style:italic;">Sweet Tools</span> could be specified and represented in <span style="font-weight:bold;">commON</span> sufficiently to be parsed and converted to interoperable RDF, then many similar instance-oriented datasets could likely be so as well. Thus, as we tried and refined notation and vocabulary, we tested applicability against the CSV representation of <span style="font-style:italic;">Sweet Tools</span> in addition to other CSV, JSON and XML datasets.</p>
<h3>Dataset Authoring in a Spreadsheet</h3>
<p>A large portion of the <a rel="nofollow" target="_blank" href="http://openstructs.org/iron/common-swt-annex">case study</a> describes the many advantages of authoring small datasets within spreadsheets. The useful thing about the CSV format is that these full functional capabilities of the spreadsheet are available during authoring or later updates and modifications, but, when exported, the CSV provides a relatively clean format for processing and parsing.</p>
<p>So, some of the reasons for small dataset authoring in a spreadsheet include:</p>
<ul>
<li> <span style="font-style:italic;">Formatting and on-sheet management</span> - the first usefulness of a spreadsheet comes from being able to format and organize the records. Records can be given background colors to highlight distinctions (new entries, for example); live URL links can be embedded; contents can be wrapped and styled within cells; and the column and row heads can be &#8220;frozen&#8221;, useful when scrolling large workspaces</li>
<li> <span style="font-style:italic;">Named blocks and sorting</span> &#8211; named blocks are a powerful feature of modern spreadsheets, useful for data manipulation, printing and internal referencing by formulas and the like. Sorting with named blocks is especially important as an aid to check consistency of terminology, records completeness, duplicates checks, missing value checks, and the like. Named blocks can also be used as references in calculations. All of these features are real time savers, especially when datasets grow large and consistency of treatment and terminology is important</li>
<li> <span style="font-style:italic;">Multiple sheets and consolidated access</span> &#8211; <span style="font-weight:bold;">commON</span> modules can be specified on a single worksheet or multiple worksheets and saved as individual CSV files; because of its size and relative complexity, the <span style="font-style:italic;">Sweet Tools</span> dataset is maintained on multiple sheets. Multi-worksheet environments help keep related data and notes consolidated and more easily managed on local hard drives</li>
<li> <span style="font-style:italic;">Completeness and counts</span> &#8211; the spreadsheet <span style="font-style:italic;">counta</span> function counts the non-empty cells in a column or row, which helps indicate whether an attribute or type value is missing or a record is incomplete. Of course, similar uses can be found for many of the hundreds of functions built into a spreadsheet</li>
<li> <span style="font-style:italic;">Controlled vocabularies and data entry validation</span> &#8211; quality datasets often hinge on consistency and uniform values and terminology; the data validation utilities within spreadsheets can be applied to Boolean, ranges and mins and maxes, and to controlled vocabulary lists. Here is an example for <span style="font-style:italic;">Sweet Tools</span>, enforcing proper tool category assignments from a 50-item pick list:</li>
</ul>
<div style="margin:10px;"><img class="center_ok" style="border:0px solid;width:609px;height:373px;" title="Controlled Vocabularies and Data Entry Validation" src="http://openstructs.org/sites/openstructs.org/files/images/swt_validation.png" alt="Controlled Vocabularies and Data Entry Validation" width="609" height="373"/></div>
<ul>
<li> <span style="font-style:italic;">Specialized functions and macros</span> &#8211; all functionality of spreadsheets may be employed in the development of <span style="font-weight:bold;">commON</span> datasets. Once these have been applied, only the resulting values in the sheets are exported as CSV.</li>
</ul>
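<p>The completeness counts and controlled-vocabulary checks listed above can also be repeated on the exported CSV, outside of the spreadsheet. The following is a minimal Python sketch and not part of any <span style="font-weight:bold;">commON</span> tooling; the column names, the sample rows and the three-item category list are hypothetical stand-ins for the real <span style="font-style:italic;">Sweet Tools</span> data and its pick list.</p>

```python
import csv
import io

# Hypothetical sample export; real column names and categories will differ.
SAMPLE_CSV = """name,category,homepage
ToolA,Parser,http://example.org/a
ToolB,,http://example.org/b
ToolC,Parsr,
"""

# Hypothetical controlled vocabulary (stand-in for the real pick list).
ALLOWED_CATEGORIES = {"Parser", "Converter", "Browser"}


def count_filled(rows, column):
    """Count non-empty cells in a column, similar in spirit to COUNTA."""
    return sum(1 for row in rows if (row.get(column) or "").strip())


def invalid_values(rows, column, allowed):
    """Return (row_number, value) pairs whose non-empty value is not allowed."""
    return [
        (number, row[column])
        for number, row in enumerate(rows, start=1)
        if row[column].strip() and row[column] not in allowed
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
filled = {column: count_filled(rows, column) for column in rows[0]}
bad = invalid_values(rows, "category", ALLOWED_CATEGORIES)
print(filled)
print(bad)
```

<p>A column whose count is lower than the number of records points to missing values, and any entry reported by <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">invalid_values</span> is a candidate typo in the category assignment.</p>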
<h3>Staging <span style="font-style:italic;">Sweet Tools</span> for commON</h3>
<p>The next major section of the <a rel="nofollow" target="_blank" href="http://openstructs.org/iron/common-swt-annex">case study</a> deals with the minor conventions that must be followed in order to stage spreadsheets for <span style="font-weight:bold;">commON</span>. Not much of the specific <span style="font-weight:bold;">commON</span> vocabulary or notation is discussed below; for details, see <a rel="nofollow" href="#commON7">[7]</a>.</p>
<p>Because you can create multiple worksheets within a spreadsheet, it is not necessary to modify existing worksheets or tabs. Rather, if you are reluctant or cannot change existing information, merely create parallel duplicate sheets of the source information. These duplicate sheets have as their sole purpose export to <span style="font-weight:bold;">commON</span> CSV. You can maintain your spreadsheet as is while staging for <span style="font-weight:bold;">commON</span>.</p>
<p>To do so, use the simple <span style="font-style:italic;">=</span> formula to create cross-references between the existing source spreadsheet tab and the target <span style="font-weight:bold;">commON</span> CSV export tab. (You can also do this for complete, highlighted blocks from source to target sheet.) Then, by adding the few minor conventions of <span style="font-weight:bold;">commON</span>, you have now created a staged export tab without modifying your source information in the slightest.</p>
<p>By default in Excel and Open Office, single quotes, double quotes and commas entered into a spreadsheet cell are automatically &#8216;<a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Escape_character">escaped</a>&#8217; when the sheet is saved as CSV. <span style="font-weight:bold;">commON</span> allows you to specify your own delimiter for lists (the standard is the pipe &#8216;|&#8217; character) and what the parser recognizes as the &#8216;escape&#8217; character (&#8216;&#92;&#8217; is the standard). However, in most cases you should probably leave these defaults unchanged.</p>
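<p>To illustrate how a list delimiter and an escape character interact, here is a minimal Python sketch that splits a single cell value on the pipe character while honoring a backslash escape. It is only an illustration: the function name <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">split_list</span> is hypothetical, and details such as trimming whitespace around items are assumptions rather than documented behavior of the standard <span style="font-weight:bold;">commON</span> parsers.</p>

```python
def split_list(value, delimiter="|", escape="\\"):
    """Split a cell value on the delimiter, honoring the escape character."""
    items, current, escaped = [], [], False
    for char in value:
        if escaped:
            current.append(char)   # keep the escaped character literally
            escaped = False
        elif char == escape:
            escaped = True         # the next character is taken literally
        elif char == delimiter:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    items.append("".join(current).strip())
    return items


plain = split_list("RDF | OWL | SPARQL")
with_escape = split_list(r"a\|b|c")
print(plain)
print(with_escape)
```

<p>In the second call the escaped pipe stays inside the first item instead of starting a new one.</p>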
<p>The standard <span style="font-weight:bold;">commON</span> parsers and converters are UTF-8 compatible. If your source content has unusual encodings, try to target UTF-8 as your canonical spreadsheet output.</p>
<p>In the <a rel="nofollow" target="_blank" href="http://openstructs.org/iron/iron-specification"><span style="font-weight:bold;">irON</span> specification</a> there are a small number of defined modules or processing sections. In <span style="font-weight:bold;">commON</span>, these modules are denoted by the double-ampersand character sequence (&#8216;<span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;</span>&#8217;), and apply to lists of instance records (<span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;recordList</span>), dataset specifications and associated metadata describing the dataset (<span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;dataset</span>), and mappings of attributes and types to existing schema (<span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;linkage</span>). Similarly, attributes and types are denoted by a single ampersand prefix (<span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;attributeName</span>).</p>
<p>In <span style="font-weight:bold;">commON</span>, any or all of the modules can occur within a single CSV file or in multiple files. In any case, the start of one of these processing modules is signaled by the module keyword and <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;keyword</span> convention.</p>
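<p>As a rough illustration of the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;keyword</span> convention, the Python sketch below groups the rows of a single CSV file under the most recent module keyword found in the first column. This is not the actual <span style="font-weight:bold;">commON</span> parser; the sample content (including the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;prefLabel</span> column and all values) and the function name are made up for the example.</p>

```python
import csv
import io

# Hypothetical file content holding two modules in one CSV file.
SAMPLE = """&&dataset
&id,&prefLabel
swt,Sweet Tools
&&recordList
&id,&prefLabel,&description
tool1,Tool One,"A parser, with a converter"
"""


def split_modules(text):
    """Group CSV rows under the most recent &&module keyword."""
    modules = {}
    current = None
    for row in csv.reader(io.StringIO(text)):
        if not any(cell.strip() for cell in row):
            continue                      # skip blank rows
        first = row[0].strip()
        if first.startswith("&&"):
            current = first[2:]           # a new module starts here
            modules.setdefault(current, [])
        elif current is not None:
            modules[current].append(row)  # row belongs to the open module
    return modules


modules = split_modules(SAMPLE)
for name, rows in modules.items():
    print(name, len(rows))
```

<p>The same function works unchanged when each module is saved in its own file, since each file then simply contains one keyword.</p>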
<h4>The RecordList Module</h4>
<p>The first spreadsheet figure above shows a <span style="font-style:italic;">Sweet Tools</span> example for the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;recordList</span> module. The module begins with that keyword, indicating one or more instance records will follow. Note that the first row after the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;recordList</span> keyword lists the attributes and types for the instance records, one per column, each designated by the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;attributeName</span> convention.</p>
<p>The <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;recordList</span> format can also include the <span style="font-style:italic;">stacked</span> style (see similar Dataset example below) in addition to the single <span style="font-style:italic;">row</span> style shown above.</p>
<p>At any rate, once a worksheet is ready with its instance records following the straightforward <span style="font-weight:bold;">irON</span> and <span style="font-weight:bold;">commON</span> conventions, it can then be saved as a CSV file and appropriately named. Here is an example of what this &#8220;vanilla&#8221; CSV file now looks like when shown again in a spreadsheet:</p>
<div style="margin:10px;text-align:center;"><a rel="nofollow" target="_blank" href="http://openstructs.org/sites/openstructs.org/files/images/swt_csv_spreadsheet_view.png"> <img class="center_ok" style="border:0px solid;width:740px;height:342px;" title="Click to expand" src="http://openstructs.org/sites/openstructs.org/files/images/swt_csv_spreadsheet_view.png" alt="Spreadsheet View of the CSV File" width="1271" height="587"/></a><span><br />
</span> <span style="font-style:italic;font-size:90%;">(click to expand)</span></div>
<p>Alternatively, you could open this same file in a text editor. Here is how this exact same instance record view looks in an editor:</p>
<div style="margin:10px;text-align:center;"><a rel="nofollow" target="_blank" href="http://openstructs.org/sites/openstructs.org/files/images/swt_csv_editor_view.png"> <img class="center_ok" style="border:0px solid;width:740px;height:389px;" title="Click to expand" src="http://openstructs.org/sites/openstructs.org/files/images/swt_csv_editor_view.png" alt="Editor View of the CSV Record File" width="1251" height="657"/></a><br />
<span style="font-style:italic;font-size:90%;">(click to expand)</span></div>
<p>Note that the CSV format separates each column by the comma separator, with escapes shown for the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;description</span> attribute when it includes a comma-separated clause. Without word wrap, each record in this format occupies a single row (though, again, for the <span style="font-style:italic;">stacked</span> style, multiple entries are allowed on individual rows so long as a new instance record <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;id</span> is not encountered in the first column).</p>
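<p>For the single <span style="font-style:italic;">row</span> style, reading such a file back is straightforward with any standard CSV library. The Python sketch below is an illustration only, with hypothetical sample records and a hypothetical function name; it does not handle the <span style="font-style:italic;">stacked</span> style. It shows that a quoted description containing a comma is returned as one value.</p>

```python
import csv
import io

# Hypothetical row-style export; the records are made up for the example.
SAMPLE = '''&&recordList
&id,&name,&description
tool1,Tool One,"A parser, with a converter"
tool2,Tool Two,Plain text
'''


def parse_record_list(text):
    """Turn a row-style &&recordList CSV into a list of dictionaries."""
    rows = [r for r in csv.reader(io.StringIO(text)) if any(c.strip() for c in r)]
    if not rows or rows[0][0].strip() != "&&recordList":
        raise ValueError("expected the &&recordList keyword on the first row")
    # The row after the keyword names the attributes, one per column.
    attributes = [cell.strip().lstrip("&") for cell in rows[1]]
    return [dict(zip(attributes, row)) for row in rows[2:]]


records = parse_record_list(SAMPLE)
for record in records:
    print(record["id"], "->", record["description"])
```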
<h4>The Dataset Module</h4>
<p>The <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;dataset</span> module defines the dataset parameters and provides very flexible metadata attributes to describe the dataset <a rel="nofollow" href="#commON8">[8]</a>. Note the dataset specification is exactly equivalent in form to the instance record (<span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;recordList</span>) format, and also allows the single <span style="font-style:italic;">row</span> or <span style="font-style:italic;">stacked</span> styles (see these <a rel="nofollow" target="_blank" href="http://openstructs.org/iron/iron-specification#mozTocId223991">instance record examples</a>), with this one being the <span style="font-style:italic;">stacked</span> style:</p>
<div style="margin:10px;text-align:center;"><a rel="nofollow" target="_blank" href="http://openstructs.org/sites/openstructs.org/files/images/swt_dataset.png"> <img class="center_ok" style="border:0px solid;width:740px;height:105px;" title="Click to expand" src="http://openstructs.org/sites/openstructs.org/files/images/swt_dataset.png" alt="The Dataset Module" width="1579" height="223"/></a><br />
<span style="font-style:italic;font-size:90%;">(click to expand)</span></div>
<h4>The Linkage Module</h4>
<p>The <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;&amp;linkage</span> module is used to map the structure of the instance records to some structural schema, which can also include external ontologies. The module has a simple, but specific structure.</p>
<p>Either attributes (presented as the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;attributeList</span>) or types (presented as the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;typeList</span>) are listed sequentially by row until the listing is exhausted <a rel="nofollow" href="#commON8">[8]</a>. By convention, the second column in the listing is the targeted <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;mapTo</span> value. Absent a prior <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;prefixList</span> value, the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;mapTo</span> value needs to be a full URL to the corresponding attribute or type in some external schema:</p>
<div style="margin:10px;"><img class="center_ok" style="border:0px solid;width:537px;height:595px;" title="The Linkage Module" src="http://openstructs.org/sites/openstructs.org/files/images/swt_linkage.png" alt="The Linkage Module" width="537" height="595"/></div>
<p>Notice in the case of <span style="font-style:italic;">Sweet Tools</span> that most values are from the actual COSMO mini-ontology underlying the listing. These need to be listed as well, since absent the specifications in <span style="font-weight:bold;">commON</span> the system has NO knowledge of linkages and mappings.</p>
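<p>The effect of a prefix on the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">&amp;mapTo</span> values can be sketched as follows. This Python example is an assumption-laden illustration: the <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">prefix:name</span> shorthand, the prefix table and the function name are hypothetical and not taken from the <span style="font-weight:bold;">commON</span> specification; only the three target URLs are ones that appear in the links of this article.</p>

```python
# Hypothetical prefix table (the role played by &prefixList in commON).
PREFIXES = {
    "cosmo": "http://purl.org/ontology/cosmo#",
    "iron": "http://purl.org/ontology/iron#",
}

# Hypothetical linkage rows: (attribute or type name, &mapTo value).
LINKAGE_ROWS = [
    ("description", "iron:description"),
    ("status", "http://purl.org/ontology/cosmo#status"),
    ("KRBrowser", "cosmo:KRBrowser"),
]


def resolve_map_to(value, prefixes):
    """Return a full URL for a mapTo value, expanding a known prefix."""
    if value.startswith(("http://", "https://")):
        return value                      # already a full URL
    prefix, sep, local = value.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local   # assumed prefix:name shorthand
    raise ValueError(f"cannot resolve mapTo value: {value!r}")


mapping = {name: resolve_map_to(target, PREFIXES) for name, target in LINKAGE_ROWS}
for name, url in mapping.items():
    print(name, "->", url)
```

<p>Without a prefix table, only the second row (a full URL) would resolve, which mirrors the requirement stated above.</p>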
<h4>The Schema (structure) Module</h4>
<p>In its current state of development, <span style="font-weight:bold;">commON</span> does not support a spreadsheet-based means for specifying the schema structure (lightweight ontology) governing the datasets <a rel="nofollow" href="#commON2">[2]</a>. Another <span style="font-weight:bold;">irON</span> serialization, <span style="font-weight:bold;">irJSON</span>, does. Either via this <span style="font-weight:bold;">irJSON</span> specification or via an offline ontology, a link reference is presently used by <span style="font-weight:bold;">commON</span> (and, therefore, <span style="font-style:italic;">Sweet Tools</span> for this case study) to establish the governing structure of the input instance record datasets.</p>
<p>A spreadsheet-based schema structure for <span style="font-weight:bold;">commON</span> has been designed and tested in prototype form. <span style="font-weight:bold;">commON</span> should be enhanced with this capability in the near future <a rel="nofollow" href="#commON8">[8]</a>.</p>
<h4>Saving and Importing</h4>
<p>If the modules are spread across more than one worksheet, then each worksheet must be saved as its own CSV file. In the case of <span style="font-style:italic;">Sweet Tools</span>, as exhibited by its current reference spreadsheet, <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">sweet_tools_20091110.xls</span>, three individual CSV files get saved. These files can be named whatever you would like. However, it is essential that the names be remembered for later referencing.</p>
<p>My own naming convention is to use a format of <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">appname_date_modulename.csv</span> because it sorts well in a file manager accommodating multiple versions (dates) and keeps related files clustered. The <span style="font-style:italic;">appname</span> in the case of <span style="font-style:italic;">Sweet Tools</span> is generally <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">swt</span>. The <span style="font-style:italic;">modulename</span> is generally the <span style="font-style:italic;">dataset</span>, <span style="font-style:italic;">records</span>, or <span style="font-style:italic;">linkage</span> convention. I tend to use the <span style="font-style:italic;">date</span> specification in the YYYYMMDD format. Thus, in the case of the <span style="font-style:italic;">records</span> listings for <span style="font-style:italic;">Sweet Tools</span>, its filename could be something like: <span style="font-family:Courier New, Courier, monospace;font-weight:bold;">swt_20091110_records.csv</span>.</p>
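<p>If you export often, the naming convention is easy to automate. Here is a small Python sketch that builds the three file names for a given date; the function name and the check against the three module names are my own illustrative choices, not a requirement of <span style="font-weight:bold;">commON</span>.</p>

```python
from datetime import date

# Module name parts used in the naming convention described above.
MODULE_NAMES = ("dataset", "records", "linkage")


def export_filename(appname, when, modulename):
    """Build an appname_date_modulename.csv file name (date as YYYYMMDD)."""
    if modulename not in MODULE_NAMES:
        raise ValueError(f"unexpected module name: {modulename!r}")
    return f"{appname}_{when:%Y%m%d}_{modulename}.csv"


names = [export_filename("swt", date(2009, 11, 10), m) for m in MODULE_NAMES]
print(names)
```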
<p>Once saved, these files are now ready to be imported into a <span style="font-weight:bold;">structWSF</span> <a rel="nofollow" href="#commON9">[9]</a> instance, which is where the CSV parsing and conversion to interoperable RDF occurs<a rel="nofollow" href="#commON8"> [8]</a>. In this case study, we used the Drupal-based <span style="font-weight:bold;">conStruct SCS</span> system <a rel="nofollow" href="#commON10">[10]</a>. <span style="font-weight:bold;">conStruct</span> exposes the <span style="font-weight:bold;">structWSF</span> Web services via a user interface and a user permission and access system. The actual case study write-up offers more details about the import process.</p>
<h3>Using the Dataset</h3>
<p>We are now ready to interact with the <span style="font-style:italic;">Sweet Tools</span> structured dataset using <span style="font-weight:bold;">conStruct</span> (assuming you have a Drupal installation with the <span style="font-weight:bold;">conStruct</span> modules) <a rel="nofollow" href="#commON10">[10]</a>.</p>
<h4>Introduction to the App</h4>
<p>The screen capture below shows a couple of aspects of the system:</p>
<ul>
<li>First, the left hand panel (according to how this specific Drupal install was themed) shows the various tools available to <span style="font-weight:bold;">conStruct</span>. These include (with links to their documentation) <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/search">Search</a>, <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/browse">Browse</a>, <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/view-record">View Record</a>, <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/import">Import</a>, <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/export">Export</a>, <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/datasets"> Datasets</a>, <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/create-record">Create Record</a>, <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/update-record">Update Record</a>, <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/delete-record">Delete Record</a> and <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/settings">Settings</a><a rel="nofollow" href="#commON11"> [11]</a>;</li>
<li>The Browse tree in the main part of the screen shows the full mini-ontology that classifies <span style="font-style:italic;">Sweet Tools</span>. Via simple inferencing, clicking on any parent link displays all child projects for that category as well <span style="font-style:italic;">(click to expand)</span>:</li>
</ul>
<div style="margin:10px;text-align:center;"><a rel="nofollow" target="_blank" href="http://openstructs.org/sites/openstructs.org/files/images/swt_drupal_browse.png"> <img class="center_ok" style="border:0px solid;width:740px;height:1907px;" title="Click to expand" src="http://openstructs.org/sites/openstructs.org/files/images/swt_drupal_browse.png" alt="conStruct (Drupal) Browse Screen for Sweet Tools" width="1176" height="3031"/></a><span style="font-style:italic;font-size:90%;">(click to expand)</span></div>
<p>One of the absolutely cool things about this framework is that all tools, inferencing, user interfaces and data structure are a direct result of the ontology(ies) underlying the system (plus the <span style="font-weight:bold;">irON</span> instance ontology, as well). This means that switching datasets or adding datasets causes the entire system structure to now reflect those changes — without lifting a finger!!</p>
<h4>Some Sample Uses</h4>
<p>Here are a few sample things you can do with these generic tools driven by the <em>Sweet Tools</em> dataset:</p>
<ul>
<li> <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/browse/">Browsing the ontology tree</a> (then, Browse by Kind)</li>
<li>Viewing an <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/view/?uri=http%3A%2F%2Fpurl.org%2Fontology%2Fswt%2Firon&amp;dataset=http%3A%2F%2Fconstructscs.com%2Fwsf%2Fdatasets%2F181%2F"> instance record</a></li>
<li>Viewing a <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/ontology/view/?uri=http%3A%2F%2Fpurl.org%2Fontology%2Fcosmo%23KRBrowser"> Class Type Report</a></li>
<li>Viewing an <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/ontology/view/?uri=http%3A%2F%2Fpurl.org%2Fontology%2Firon%23description"> Attribute Report</a></li>
<li> <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/search/?filter_types_3=http%3A%2F%2Fpurl.org%2Fontology%2Fcosmo%23KRBrowser&amp;filter_attributes_4=http%3A%2F%2Fpurl.org%2Fontology%2Fcosmo%23status&amp;query=new&amp;filter=on"> Searching by facet</a> (check the tabs)</li>
<li>Doing <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/search/">multi-value filtering</a> (make selections from the various tabs)</li>
<li> <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/export/">Exporting stuff</a> in a variety of formats.</li>
</ul>
<p>Note, if you access this <span style="font-weight:bold;">conStruct</span> instance you will do so as a <span style="font-style:italic;">demo</span> user. Unfortunately, as such, you may not be able to see all of the write and update tools, which in this case are reserved for curators or admins. Recall that <span style="font-weight:bold;">structWSF</span> has a comprehensive <a rel="nofollow"> user access and permissions layer</a>.</p>
<h4>Exporting in Alternative Formats</h4>
<p>Of course, one of the real advantages of the <span style="font-weight:bold;">irON</span> and <span style="font-weight:bold;">structWSF</span> designs is to enable different formats to be interchanged and to interoperate. Upon submission, the <span style="font-weight:bold;">commON</span> format and its datasets can then be exported in these alternate formats and serializations <a rel="nofollow" href="#commON8">[8]</a>:</p>
<ul>
<li>commON</li>
<li>irJSON</li>
<li>irXML</li>
<li>N-Triples/CSV</li>
<li>N-Triples/TSV</li>
<li>RDF+N3</li>
<li>RDF+XML</li>
</ul>
<p>As should be obvious, one of the real benefits of the <span style="font-weight:bold;">irON</span> notation &#8212; in addition to easy dataset authoring &#8212; is the ability to more-or-less treat RDF, CSV, XML and JSON as interoperable data formats.</p>
<h3>The Formal Case Study</h3>
<p>The formal <span style="font-style:italic;">Sweet Tools</span> case study based on <span style="font-weight:bold;">commON</span>, with sample download files and PDF, is available from <a rel="nofollow" style="font-style:italic;" target="_blank" href="http://openstructs.org/iron/common-swt-annex">Annex: A commON Case Study using Sweet Tools, Supplementary Documentation</a> <a rel="nofollow" href="#commON3">[3]</a>.</p>
<hr style="margin:15px 0px;" size="1"/>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="commON1" name="commON1"></a> [1] In 2003, <a rel="nofollow" target="_blank" href="http://www.microsoft.com/presspass/press/2003/oct03/10-13vstoofficelaunchpr.mspx"> Microsoft estimated</a> its worldwide users of the Excel spreadsheet, which then had about a 90% market share globally, at 400 million. Others at that time estimated unauthorized use to perhaps double that amount. There has been significant growth since then, and online spreadsheets such as Google Docs and Zoho have also grown wildly. This surely puts spreadsheet users globally into the 1 billion range.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="commON2" name="commON2"></a> [2] See Frédérick Giasson and Michael Bergman, eds., <span style="font-style:italic;">Instance Record and Object Notation (irON) Specification, Specification Document</span>, version 0.82, 20 October 2009. See <a rel="nofollow" target="_blank" href="http://openstructs.org/iron/iron-specification">http://openstructs.org/iron/iron-specification</a>. Also see the <a rel="nofollow" target="_blank" href="http://openstructs.org/iron"><span style="font-weight:bold;">irON</span> Web site</a>, Google <a rel="nofollow" target="_blank" href="http://groups.google.com/group/iron-notation">discussion group</a>, and <a rel="nofollow" target="_blank" href="http://code.google.com/p/iron-notation/">code distribution site</a>.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="commON3" name="commON3"></a> [3] Michael Bergman, 2009. <span style="font-style:italic;">Annex: A commON Case Study using Sweet Tools, Supplementary Documentation</span>, prepared by Structured Dynamics LLC, November 10, 2009. See <a rel="nofollow" target="_blank" href="http://openstructs.org/iron/common-swt-annex">http://openstructs.org/iron/common-swt-annex</a>. It may also be downloaded in PDF <a rel="nofollow" target="_blank" href="http://openstructs.org/sites/openstructs.org/files/downloads/common-case-study.pdf"> <img style="border:0px solid;width:13px;height:16px;" src="http://openstructs.org/sites/openstructs.org/files/icons/pdfdoc.gif" alt=""/></a>.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="commON4" name="commON4"></a> [4] See Michael K. Bergman&#8217;s <a rel="nofollow" target="_blank" href="http://mkbergman.com/">AI3:::Adaptive Information</a> blog, <a rel="nofollow"><span style="font-style:italic;"> Sweet Tools (Sem Web)</span></a>. In addition, the <span style="font-weight:bold;">commON</span> version of <span style="font-style:italic;">Sweet Tools</span> is available at the <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/browse/?browse=true&amp;attribute=all&amp;type=all&amp;dataset=http%3A%2F%2Fconstructscs.com%2Fwsf%2Fdatasets%2F122%2F&amp;page=0"> <span style="font-weight:bold;">conStruct</span> site</a>.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="commON5" name="commON5"></a> [5] The CSV MIME type is defined in <span style="font-style:italic;">Common Format and MIME Type for Comma-Separated Values (CSV) Files</span> [<a rel="nofollow" target="_blank" href="http://www.rfc-editor.org/rfc/rfc4180.txt">RFC 4180</a>]. A useful overview of the CSV format is provided by <a rel="nofollow" title="http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm" target="_blank" href="http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm">The Comma Separated Value (CSV) File Format</a>. Also, see that author&#8217;s related CTX reference for a discussion of how schema and structure can be added to the basic CSV framework; see <a rel="nofollow" target="_blank" href="http://www.creativyst.com/Doc/Std/ctx/ctx.htm">http://www.creativyst.com/Doc/Std/ctx/ctx.htm</a>, especially the section on the comma-delimited version (<a rel="nofollow" target="_blank" href="http://www.creativyst.com/Doc/Std/ctx/ctx.htm#CTC">http://www.creativyst.com/Doc/Std/ctx/ctx.htm#CTC</a>).</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="commON6" name="commON6"></a> [6] An <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Attribute-value_system">attribute-value system</a> is a basic knowledge representation framework comprising a table with columns designating &#8220;attributes&#8221; (also known as <span style="font-style:italic;">properties</span>, <span style="font-style:italic;">predicates</span>, <span style="font-style:italic;">features</span>, <span style="font-style:italic;">parameters</span>, <span style="font-style:italic;">dimensions</span>, <span style="font-style:italic;">characteristics</span> or <span style="font-style:italic;">independent variables</span>) and rows designating &#8220;objects&#8221; (also known as <span style="font-style:italic;">entities</span>, <span style="font-style:italic;">instances</span>, <span style="font-style:italic;">exemplars</span>, <span style="font-style:italic;">elements</span> or <span style="font-style:italic;">dependent variables</span>). Each table cell therefore designates the value (also known as <span style="font-style:italic;">state</span>) of a particular attribute of a particular object. This is the basic table presentation of a spreadsheet or relational data table.
<p>Attribute-values can also be presented as pairs in a form of an <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Associative_array">associative array</a>, where the first item listed is the attribute, often followed by a separator such as the colon, and then the value. JSON and many simple data struct notations follow this format. This format may also be called <span style="font-style:italic;">attribute-value pairs</span>, <span style="font-style:italic;">key-value pairs</span>, <span style="font-style:italic;">name-value pairs</span>, <span style="font-style:italic;">alists</span> or others. In these cases the &#8220;object&#8221; is implied, or is introduced as the name of the array.</p></div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="commON7" name="commON7"></a> [7] See especially <a rel="nofollow" style="font-style:italic;" target="_blank" href="http://openstructs.org/iron/iron-specification#mozTocId603499">SUB-PART 3: commON PROFILE</a> in, Frédérick Giasson and Michael Bergman, eds., <span style="font-style:italic;">Instance Record and Object Notation (irON) Specification, Specification Document</span>, version 0.82, 20 October 2009.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="commON8" name="commON8"></a> [8] As of the date of this case study, some of the processing steps in the <span style="font-weight:bold;">commON</span> pipeline are manual. For example, the parser creates an intermediate N3 file that is actually submitted to the <span style="font-weight:bold;">structWSF</span>. Within a week or two of publication, these capabilities should be available as a direct import to a <span style="font-weight:bold;">structWSF</span> instance. However, there is one exception to this: the specification for the schema structure. That module has been prototyped, but will not be released with the first <span style="font-weight:bold;">commON</span> upgrade. That enhancement is likely a few weeks off from the date of this posting. Please check the <a rel="nofollow" target="_blank" href="http://groups.google.com/group/iron-notation"><span style="font-weight:bold;">irON</span></a> or <a rel="nofollow" style="font-weight:bold;" target="_blank" href="http://groups.google.com/group/structwsf">structWSF</a> discussion groups for announcements.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" id="commON9" name="commON9"></a> [9] <a rel="nofollow" style="font-weight:bold;" target="_blank" href="http://openstructs.org/">structWSF</a> is a platform-independent Web services framework for accessing and exposing structured RDF data, with generic tools driven by underlying data structures. Its central perspective is that of the dataset. Access and user rights are granted around these datasets, making the framework enterprise-ready and designed for collaboration. Since a <span style="font-weight:bold;">structWSF</span> layer may be placed over virtually any existing datastore with Web access &#8212; including large instance record stores in existing relational databases &#8212; it is also a framework for Web-wide deployments and interoperability.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" name="commON10"></a>[10] <a rel="nofollow" style="font-weight:bold;" target="_blank" href="http://constructscs.com/">conStruct SCS</a> is a structured content system built on the Drupal content management framework. <span style="font-weight:bold;">conStruct</span> enables structured data and its controlling vocabularies (ontologies) to drive applications and user interfaces. It is based on RDF and SD&#8217;s <span style="font-weight:bold;">structWSF</span> platform-independent Web services framework [6]. In addition to user access control and management and a general user interface, <span style="font-weight:bold;">conStruct</span> provides Drupal-level CRUD, data display templating, faceted browsing, full-text search, and import and export over structured data stores based on RDF.</div>
<div style="margin:10px 0pt;font-size:90%;"><a rel="nofollow" name="commON11"></a> [11] More Web services are being added to <span style="font-weight:bold;">structWSF</span> on a fairly constant basis, and the existing ones have been through a number of upgrades.</div>
]]></content:encoded>
      </item>
      <item>
         <title>Simple semi-structured data entry</title>
         <link>http://www.snee.com/bobdc.blog/2009/11/simple-semi-structure-data-ent.html</link>
         <description>With RDF.</description>
         <guid isPermaLink="false">http://www.snee.com/bobdc.blog/2009/11/simple-semi-structure-data-ent.html</guid>
         <pubDate>Wed, 11 Nov 2009 17:48:55 -0800</pubDate>
         <category>RDF</category>
      </item>
      <item>
         <title>Microsoft's Bing becomes Wolfram|Alpha API Customer</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/s6OdS3vuMj4/industry-news-microsofts-bing-becomes-wolframalpha-api-customer.html</link>
         <description>&lt;p&gt;Microsoft and Wolfram|Alpha both announced today (via their blogs) that Microsoft's Bing decision engine is one of the first Wolfram|Alpha Webservice API customers.&lt;br /&gt;
&lt;br /&gt;
From Wolfram's site: &quot;Starting today, Wolfram|Alpha's knowledge, computed from expertly curated data, will enrich Bing's results in select areas across nutrition, health, and advanced mathematics.&quot; The full post can be found here:&lt;/p&gt;
&lt;p&gt;&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://url.wolfram.com/83xxl-./&quot;&gt;Wolfram Blog Post&lt;br /&gt;
&lt;/a&gt;&lt;/p&gt;</description>
         <guid isPermaLink="false">4682 at http://www.semanticuniverse.com</guid>
         <pubDate>Wed, 11 Nov 2009 11:59:54 -0800</pubDate>
      </item>
      <item>
         <title>RDF Geography With Virtuoso</title>
         <link>http://www.openlinksw.com/weblog/oerling/?id=1587</link>
         <description>&lt;p&gt;We have just added a geometry &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://dbpedia.org/resource/Data&quot; id=&quot;link-id0x1c4085f8&quot;&gt;data&lt;/a&gt; type and corresponding &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://dbpedia.org/resource/R-tree&quot; id=&quot;link-id0x1c2ea830&quot;&gt;R&lt;/a&gt;-tree index to &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://virtuoso.openlinksw.com&quot; id=&quot;link-id0x201556b0&quot;&gt;Virtuoso&lt;/a&gt;. This follows the general scheme of &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://dbpedia.org/resource/SQL&quot; id=&quot;link-id0x20152fc0&quot;&gt;SQL&lt;/a&gt;/MM, as is implemented by &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://dbpedia.org/resource/PostGIS&quot; id=&quot;link-id0x1c1a7610&quot;&gt;PostGIS&lt;/a&gt; and many others. We have all the engine-side stuff, including optimizer support for geometry cardinality sampling and good execution plans for combinations of spatial and other joins. We have however not yet implemented all the different geometry types and library function support for them, like shortest distance between two arbitrary shapes.&lt;/p&gt; &lt;p&gt;The geometry support is for both SQL and &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://dbpedia.org/resource/SPARQL&quot; id=&quot;link-id0x1c0fe6f8&quot;&gt;SPARQL&lt;/a&gt;. On the SQL side, it works with the ISO/IEC 13249 SQL/MM API; with &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://dbpedia.org/resource/Resource_Description_Framework&quot; id=&quot;link-id0x20d637e8&quot;&gt;RDF&lt;/a&gt;, a geometry can occur as the object of a quad. 
If the object is a typed-literal of the &lt;code&gt;virtrdf:Geometry&lt;/code&gt; type, it gets indexed in a geometry index over all geometries in quads; no special declarations are needed. After this, SQL/MM predicates and functions can be used with SPARQL, like this:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;PREFIX geo: &amp;lt;http://www.w3.org/2003/01/geo/wgs84_pos#&amp;gt;
SELECT ?class COUNT (*)
WHERE
  {
    ?m geo:geometry ?geo .
    ?m a ?class .
    FILTER ( &amp;lt;bif:st_intersects&amp;gt; ( ?geo, &amp;lt;bif:st_point&amp;gt; (0, 52), 100 ) )
  }
GROUP BY ?class
ORDER BY DESC 2
&lt;/code&gt;&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This returns the counts of objects of each class occurring within 100 km of (0, 52), a point near London.&lt;/p&gt; &lt;p&gt;For any data set with &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://dbpedia.org/resource/World_Geodetic_System&quot; id=&quot;link-id0x1fa3a1d0&quot;&gt;WGS 84&lt;/a&gt; &lt;code&gt;geo:long&lt;/code&gt; and &lt;code&gt;geo:lat&lt;/code&gt; values, a simple SQL function makes a point geometry for each such coordinate pair and adds it as the &lt;code&gt;geo:geometry&lt;/code&gt; property of the subject with the long/lat. This then enables fast spatial access to arbitrary location data in RDF.&lt;/p&gt; &lt;p&gt;Right now, we hardly see any geometries other than points in RDF data, even though there are some efforts for vocabularies for more complex entities. As these get adopted we will support them.&lt;/p&gt; &lt;p&gt;For scalability, we tried the implementation with &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.openstreetmap.org/&quot; id=&quot;link-id0x1c4207d0&quot;&gt;OpenStreetMap&lt;/a&gt;'s 350 million or so points. 
The geometry implementation partitions well over a cluster, similarly to a full text index, i.e., every server has its slice of the geometries, partitioned by the geometry object's key, thus not by range of coordinates or such. This way, the items are evenly spread even though the coordinate distribution is highly uneven.&lt;/p&gt; &lt;p&gt;We can do spatial joins like this:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;SELECT ?s ( &amp;lt;sql:num_or_null&amp;gt; (?p) ) COUNT (*)
WHERE
  {
    ?s &amp;lt;http://dbpedia.org/ontology/populationTotal&amp;gt; ?p .
    FILTER ( &amp;lt;sql:num_or_null&amp;gt; (?p) &amp;gt; 1000000 ) .
    ?s geo:geometry ?geo .
    FILTER ( &amp;lt;bif:st_intersects&amp;gt; ( ?pt, ?geo, 5 ) ) .
    ?xx geo:geometry ?pt
  }
GROUP BY ?s ( &amp;lt;sql:num_or_null&amp;gt; (?p) )
ORDER BY DESC 3
LIMIT 20
&lt;/code&gt;&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This takes the DBpedia subjects that have a population over 1 million and a geometry. We then count all the geometries within 5 km of the point location of the first geometry. With DBpedia (about 5 million points), &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.geonames.org/&quot; id=&quot;link-id0x21af78d0&quot;&gt;GeoNames&lt;/a&gt; (7 million points), and OpenStreetMap (350 million points), we get the result:&lt;/p&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;code&gt;http://dbpedia.org/resource/Munich 1356594 117280
http://dbpedia.org/resource/London 7355400 81486
http://dbpedia.org/resource/Davao_City 1363337 58640
http://dbpedia.org/resource/Belo_Horizonte 2412937 58640
http://dbpedia.org/resource/Chengde 3610000 58640
http://dbpedia.org/resource/Hamburg 1769117 51664
http://dbpedia.org/resource/San_Diego%2C_California 1266731 47685
http://dbpedia.org/resource/Bursa 1562828 47685
http://dbpedia.org/resource/Port-au-Prince 1082800 47685
http://dbpedia.org/resource/Oakland_County%2C_Michigan 1194156 45636
http://dbpedia.org/resource/Sana%27a 1747627 40923
http://dbpedia.org/resource/Milan 1303437 40923
http://dbpedia.org/resource/Campinas 1059420 40923
http://dbpedia.org/resource/Hohhot 2580000 40923
http://dbpedia.org/resource/Brussels 1031215 40923
http://dbpedia.org/resource/Bogra_District 2988567 40923
http://dbpedia.org/resource/Cort%C3%A9s_Department 1202510 40923
http://dbpedia.org/resource/Berlin 3416300 35668
http://dbpedia.org/resource/New_York_City 8274527 30810
http://dbpedia.org/resource/Los_Angeles%2C_California 3849378 25614

20 Rows. -- 1733 msec.
Cluster 8 nodes, 1 s. 358 m/s 1596 KB/s 664% cpu 2% read 16% clw threads 1r 0w 0i buffers 1124351 0 d 0 w 0 pfs
&lt;/code&gt;&lt;/pre&gt;&lt;/blockquote&gt; &lt;p&gt;This takes 1.7 seconds on a Virtuoso Cluster configured with 8 processes on a single dual-Xeon 5520 box, running at about 664% CPU with warm &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://dbpedia.org/resource/Cache&quot; id=&quot;link-id0x1c420158&quot;&gt;cache&lt;/a&gt;. Fair enough for a first crack; this can obviously be optimized further. Still, the geo part of the processing is already as good as instantaneous.&lt;/p&gt; &lt;p&gt;We will shortly have the geography features installed on DBpedia and the other data sets we host. As these come online we will show more demo queries.&lt;/p&gt; &lt;p&gt;For more about SQL/MM, you can look to a couple of PDFs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.fer.hr/_download/repository/SQLMM_Spatial-_The_Standard_to_Manage_Spatial_Data_in_Relational_Database_Systems.pdf&quot; id=&quot;link-id133775f0&quot;&gt;SQL/MM Spatial: The Standard to Manage Spatial Data in
Relational Database Systems&lt;/a&gt; by Knut Stolze&lt;/li&gt;
&lt;li&gt; &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.sigmod.org/record/issues/0112/standards.pdf&quot; id=&quot;link-id1433c5e0&quot;&gt;SQL Multimedia and Application Packages (SQL/MM)&lt;/a&gt; by Jim Melton and Andrew Eisenberg&lt;/li&gt;
&lt;/ul&gt;</description>
         <author>Orri Erling</author>
         <guid isPermaLink="false">http://www.openlinksw.com/weblog/oerling/?id=1587</guid>
         <pubDate>Wed, 11 Nov 2009 09:17:27 -0800</pubDate>
      </item>
      <item>
         <title>CFP: JWS special issue on semantic search</title>
         <link>http://ebiquity.umbc.edu/blogger/2009/11/11/cfp-jws-special-issue-on-semantic-search/</link>
         <description>Yong Yu and Rudi Studer are editing a special issue of the Journal of Web Semantics on semantic search that will appear in the summer of 2010. The special issue will cover interdisciplinary topics between Semantic Web and search. See the call for papers for a list of relevant topics and details on how to [...]</description>
         <guid isPermaLink="false">http://ebiquity.umbc.edu/blogger/?p=2664</guid>
         <pubDate>Wed, 11 Nov 2009 06:21:59 -0800</pubDate>
         <content:encoded><![CDATA[<p><a rel="nofollow" target="_blank" href="http://apex.sjtu.edu.cn/apex_wiki/yyu">Yong Yu</a> and <a rel="nofollow" target="_blank" href="http://www.aifb.uni-karlsruhe.de/Staff/Personen/viewPerson?id_db=57">Rudi Studer</a> are editing a special issue of the <a rel="nofollow" target="_blank" href="http://ees.elsevier.com/jws/">Journal of Web Semantics</a> on <i>semantic search</i> that will appear in the summer of 2010. The special issue will cover interdisciplinary topics between Semantic Web and search. See the <a rel="nofollow" target="_blank" href="http://journalofwebsemantics.blogspot.com/2009/11/jws-special-issue-on-semantic-search.html">call for papers</a> for a list of relevant topics and details on how to submit papers, which are due by 20 January 2010.</p>]]></content:encoded>
      </item>
      <item>
         <title>Google VP on semantic search and the Semantic Web</title>
         <link>http://ebiquity.umbc.edu/blogger/2009/11/11/google-vp-on-semantic-search-and-the-semantic-web/</link>
         <description>PCWorld has a story, Google VP Mayer Describes the Perfect Search Engine, with some interesting comments on semantic search from Marissa Mayer, Google&amp;#8217;s vice president of Search Products &amp;#038; User Experience. &amp;#8220;IDGNS: What&amp;#8217;s the status of semantic search at Google? You have said in the past that through &amp;#8220;brute force&amp;#8221; &amp;#8212; analyzing massive amounts of queries [...]</description>
         <guid isPermaLink="false">http://ebiquity.umbc.edu/blogger/?p=2660</guid>
         <pubDate>Wed, 11 Nov 2009 06:00:23 -0800</pubDate>
         <content:encoded><![CDATA[<p>PCWorld has a story, <a rel="nofollow" target="_blank" href="http://www.pcworld.com/businesscenter/article/181874/google_vp_mayer_describes_the_perfect_search_engine.html">Google VP Mayer Describes the Perfect Search Engine</a>, with some interesting comments on <i>semantic search</i> from Marissa Mayer, Google&#8217;s vice president of Search Products &#038; User Experience.</p>
<blockquote><p>
&#8220;IDGNS: What&#8217;s the status of semantic search at Google? You have said in the past that through &#8220;brute force&#8221; &#8212; analyzing massive amounts of queries and Web content &#8212; Google&#8217;s engine can deliver results that make it seem as if it understood things semantically, when it really functions using other algorithmic approaches. Is that still the preferred approach?</p>
<p>Mayer: We believe in building intelligent systems that learn off of data in an automated way, [and then] tuning and refining them. When people talk about semantic search and the semantic Web, they usually mean something that is very manual, with maps of various associations between words and things like that. We think you can get to a much better level of understanding through pattern-matching data, building large-scale systems. That&#8217;s how the brain works. That&#8217;s why you have all these fuzzy connections, because the brain is constantly processing lots and lots of data all the time.</p>
<p>IDGNS: A couple of years ago or so, some experts were predicting that semantic technology would revolutionize search and blindside Google, but that hasn&#8217;t happened. It seems that semantic search efforts have hit a wall, especially because semantic engines are hard to scale.</p>
<p>Mayer: The problem is that language changes. Web pages change. How people express themselves changes. And all those things matter in terms of how well semantic search applies. That&#8217;s why it&#8217;s better to have an approach that&#8217;s based on machine learning and that changes, iterates and responds to the data. That&#8217;s a more robust approach. That&#8217;s not to say that semantic search has no part in search. It&#8217;s just that for us, we really prefer to focus on things that can scale. If we could come up with a semantic search solution that could scale, we would be very excited about that. For now, what we&#8217;re seeing is that a lot of our methods approximate the intelligence of semantic search but do it through other means.&#8221; </p></blockquote>
<p>I interpret these comments to mean that Google&#8217;s management still views the concept of semantic search (and the Semantic Web) as involving better understanding of the intended meaning of text in documents and queries. The W3C&#8217;s <i>web of data</i> model is still not on their radar.</p>]]></content:encoded>
      </item>
      <item>
         <title>Sweet Tools Shatters the Sound Barrier</title>
         <link>http://feedproxy.google.com/~r/AI3_AdaptiveInformation/~3/UyvETygl-W4/</link>
         <description>&lt;span class=&quot;Z3988&quot; title=&quot;ctx_ver=Z39.88-2004&amp;amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;amp;rft.title=Sweet Tools Shatters the Sound Barrier&amp;amp;rft.aulast=Bergman&amp;amp;rft.aufirst=Mike&amp;amp;rft.subject=Open Source&amp;amp;rft.subject=Semantic Web Tools&amp;amp;rft.subject=Structured Web&amp;amp;rft.source=AI3:::Adaptive Information&amp;amp;rft.date=2009-11-10&amp;amp;rft.type=blogPost&amp;amp;rft.format=text&amp;amp;rft.identifier=http://www.mkbergman.com/844/sweet-tools-shatters-the-sound-barrier/&amp;amp;rft.language=English&quot;&gt;&lt;/span&gt;New Release Expands to 810 Tools; Gets Major Structured Data Update
It has been eight months since the last major update to Sweet Tools, AI3&amp;#8217;s listing of semantic Web and -related tools. With today&amp;#8217;s release, there are now a total [...]</description>
         <guid isPermaLink="false">http://www.mkbergman.com/?p=844</guid>
         <pubDate>Tue, 10 Nov 2009 14:35:08 -0800</pubDate>
         <content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Sweet Tools Shatters the Sound Barrier&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Open Source&amp;rft.subject=Semantic Web Tools&amp;rft.subject=Structured Web&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2009-11-10&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/844/sweet-tools-shatters-the-sound-barrier/&amp;rft.language=English"></span>
<p><a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Mach_number"><img style="border:0px solid;width:380px;height:271px;float:left;margin-right:10px;" title="Sweet Tools breaks sound barrier" src="http://upload.wikimedia.org/wikipedia/commons/d/d0/FA-18_Hornet_breaking_sound_barrier_%287_July_1999%29.jpg" alt="Sweet Tools breaks sound barrier" hspace="5" vspace="5" align="left"/></a></p>
<h2>New Release Expands to 810 Tools; Gets Major Structured Data Update</h2>
<p>It has been eight months since the last major update to <span style="color:#993300;"><strong><a rel="nofollow">Sweet Tools</a></strong></span>, <span style="color:maroon;"><strong>AI3</strong></span>&#8217;s listing of semantic Web and -related tools. With today&#8217;s release, there are now a total of <span style="font-weight:bold;">810 tools listed</span>, crashing through the <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Sound_barrier">sound barrier</a> of 761 tools. With the retirement of 19 prior tools, this new listing represents an increase of 93 tools, or 13%, from the previous version that listed 736.</p>
<p>But simply adding to the tools listing is not the cause of this longer-than-normal period between updates.</p>
<p>This little <span style="color:#993300;"><strong><a rel="nofollow">Sweet Tools</a></strong></span> dataset is now showing the way to a couple of exciting innovations: new generic <span style="font-style:italic;">ontology-driven applications</span> for structured data, and tools for authoring structured data via spreadsheets.</p>
<p>We deal with the former in this post. I will deal with the spreadsheet business in a subsequent post.</p>
<h3>Summary of Major Changes</h3>
<p>So, here is the summary of major changes in this new listing:</p>
<ul>
<li> <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/browse/"><img style="border:1px solid #990000;margin-left:10px;width:260px;height:163px;margin-top:5px;margin-bottom:5px;float:right;" title="Sweet Tools conStruct Structured View" alt="Sweet Tools conStruct Structured View" vspace="5" width="987" height="617"/></a>A completely new structured data view of the listing, courtesy of <a rel="nofollow" target="_blank" href="http://structureddynamics.com/">Structured Dynamics</a>&#8217; <a rel="nofollow" target="_blank" href="http://openstructs.org/structwsf">structWSF</a> and <a rel="nofollow" target="_blank" href="http://constructscs.com/">conStruct</a> open source frameworks. This version can be <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/browse/">viewed on the conStruct SCS Web site</a> (pick the Sweet Tools dataset). You can compare this server-side presentation and version to the client-side JavaScript <a rel="nofollow">version using Exhibit</a> that has been part of this blog for some time</li>
<li>A new structural organization of the tools into an ontology that relates portions of the <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/ACM_Computing_Classification_System">ACM classification</a> and <a rel="nofollow" target="_blank" href="http://umbel.org/">UMBEL</a> to the tools categories. This provides richer retrievals and inspections on the conStruct version (the Exhibit version remains fairly &#8220;flat&#8221; in structure)</li>
<li>In light of the above, refined tools classifications, and, of course,</li>
<li>The increase in coverage to 810 tools.</li>
</ul>
<p>To view this updated listing in its existing format, go to the main <span style="color:#993300;"><strong><a rel="nofollow">Sweet Tools</a></strong></span> page and filter on ‘New’ under <strong>New or Existing?</strong> to see the recent additions. Alternatively, you can see this same filtering using the conStruct structured data view by searching on the Status attribute using the value &#8216;New&#8217;; see the example <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/search/?filter_attributes_4=http%3A%2F%2Fpurl.org%2Fontology%2Fcosmo%23status&amp;query=new&amp;filter=on">here</a>.</p>
<div class="boxRedSolid" style="margin:0px 10px 5px 0pt;width:260px;float:left;font-size:110%;font-style:italic;text-align:center;">See the new Sweet Tools structured data display at <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/browse/">conStruct</a>!</div>
<h3>Structured Data via conStruct</h3>
<p>Though still formative, the most exciting change with the <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/browse/"><span style="color:#993300;"><strong>Sweet Tools</strong></span></a> listing is this new presentation via conStruct. It is a structured data Web services framework with a UI, all offered as a set of modules to Drupal. To kick the tires with this new system, you may want to look at:</p>
<ul>
<li> <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/browse/">Browsing the ontology tree</a> (then, Browse by Kind)</li>
<li>Viewing an <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/view/?uri=http%3A%2F%2Fpurl.org%2Fontology%2Fswt%2Firon&amp;dataset=http%3A%2F%2Fconstructscs.com%2Fwsf%2Fdatasets%2F182%2F"> instance record</a></li>
<li>Viewing a <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/ontology/view/?uri=http%3A%2F%2Fpurl.org%2Fontology%2Fcosmo%23KRBrowser"> Class Type Report</a></li>
<li>Viewing an <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/ontology/view/?uri=http%3A%2F%2Fpurl.org%2Fontology%2Firon%23description"> Attribute Report</a></li>
<li> <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/search/?filter_types_3=http%3A%2F%2Fpurl.org%2Fontology%2Fcosmo%23KRBrowser&amp;filter_attributes_4=http%3A%2F%2Fpurl.org%2Fontology%2Fcosmo%23status&amp;query=new&amp;filter=on"> Searching by facet</a> (check the tabs)</li>
<li>Doing a <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/search/">multi-value filtering</a> (make selections from the various tabs)</li>
<li> <a rel="nofollow" target="_blank" href="http://constructscs.com/conStruct/export/">Exporting stuff</a> in a variety of formats.</li>
</ul>
<p>BTW, there are some helpful <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions">documentation pages</a> that show how all of these various tools work and more, such as <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions/browse">Browse</a>. (Also, as a <span style="font-style:italic;">demo</span> user, you are not seeing all of the write and update tools; again, see the <a rel="nofollow" target="_blank" href="http://constructscs.com/documentation/instructions">documentation</a>.)</p>
<p>The essential underlying basis to conStruct is the <a rel="nofollow" target="_blank" href="http://openstructs.org/structwsf">structWSF</a> Web services framework. There are still some aspects of this system that we feel are incomplete and are working on. These include dropdown selections (controlled vocabulary selects), easier template creation, and intuitive template re-use. Nonetheless, these additions will come quickly, and what is here is already a great demonstration of how structured data can drive generic tools and interfaces.</p>
<p>As I said: More on this in a <a rel="nofollow" target="_blank" href="http://www.mkbergman.com/845/a-most-un-common-way-to-author-datasets/">later post</a>.</p>
<h3>Updated Statistics</h3>
<p>The updated <span style="color:#993300;"><strong><a rel="nofollow">Sweet Tools</a></strong></span> listing now includes nearly 50 different tools categories. The most prevalent categories are browser tools (RDF, OWL), information extraction, parsers or converters, composite application frameworks and general ontology tools. Each accounts for more than 8% &#8212; or more than 50 tools &#8212; of the total. This breakdown is as follows (click to expand):</p>
<div style="margin:10px 0px;"><a rel="nofollow"> <img class="center_ok" style="border:0px solid;width:600px;height:388px;" title="Click to expand" alt="Sweet Tools Applications"/></a>There are no real discernible trends in application tool categories over the past couple of years.</div>
<p>The languages these applications are written in have stayed pretty steady, too. Java is still the leading language at about 46%, though its share has been trending very slightly downward over the past three years or so, while PHP has increased a bit. The current splits are (click to expand):</p>
<div style="margin:10px 0px;"><a rel="nofollow"> <img class="center_ok" style="border:0px solid;width:460px;height:400px;" title="Click to expand" alt="Sweet Tools Languages"/></a></div>
<h3>Prior Updates</h3>
<p>Background on prior listings and earlier statistics may be found on these previous posts:</p>
<ul>
<li> <a rel="nofollow">Sweet Tools Updated to 736 Tools</a> (February 2, 2009)</li>
<li> <a rel="nofollow" title="Permanent Link to Sweet Tools Listing Now Exceeds 700 Tools">Sweet Tools Listing Now Exceeds 700 Tools</a> (July 5, 2008)</li>
<li> <a rel="nofollow" title="Permanent Link to Sweet Tools Updated, Opened for Collaboration">Sweet Tools Updated, Opened for Collaboration</a> (Mar. 31, 2008)</li>
<li> <a rel="nofollow">Sweet Tools Updated to 650 Tools</a> (Nov. 18, 2007)</li>
<li> <a rel="nofollow"> New Release: 578 Semantic Web and -related Tools</a> (Sept. 16, 2007)</li>
<li> <a rel="nofollow">542 Semantic Web and -related Tools</a> (Jun. 19, 2007)</li>
<li> <a rel="nofollow" title="Listing of 500 Semantic Web and Related Tools">Listing of 500 Semantic Web and Related Tools</a> (Mar. 11, 2007)</li>
<li> <a rel="nofollow" title="Sweet Tools Updated to 420 Tools">Sweet Tools Updated to 420 Tools</a> (Feb. 7, 2007)</li>
<li> <a rel="nofollow" title="Converting 'Sweet Tools' to an Exhibit">Converting &#8216;Sweet Tools&#8217; to an Exhibit</a> (Jan. 22, 2007)</li>
<li> <a rel="nofollow" title="Permanent Sweet Tools Listing -- 420+ Tools and Counting!">Permanent Sweet Tools Listing — 400+ Tools and Counting!</a> (Jan. 5, 2007)</li>
<li> <a rel="nofollow" title="Comprehensive Listing of 250 Semantic Web Tools (updated)">Comprehensive Listing of 250 Semantic Web Tools (updated)</a> (Oct. 4, 2006)</li>
<li> <a rel="nofollow" title="Comprehensive Listing of 175 Semantic Web Tools">Comprehensive Listing of 175 Semantic Web Tools</a> (Sep. 22, 2006)</li>
<li> <a rel="nofollow" title="Current Listing of 70 Semantic Web Tools">Current Listing of 70 Semantic Web Tools</a> (Aug. 12, 2006)</li>
</ul>
<p>Interim updates were also made periodically over that period.</p>
<p><strong>Note:</strong> Because comments on prior posts have expired, <strong>this entry is now the new location for adding a suggested new tool</strong>. Simply provide your information in the comments section, and your tool will be included in the next update.</p>
]]></content:encoded>
      </item>
      <item>
         <title>Google VP Mayer Describes the Perfect Search Engine - PC World</title>
         <link>http://feedproxy.google.com/~r/SemanticUniverse/~3/JZ5Zt_5O21M/industry-news-google-vp-mayer-describes-perfect-search-engine-pc-world.html</link>
         <description></description>
         <guid isPermaLink="false">4681 at http://www.semanticuniverse.com</guid>
         <pubDate>Tue, 10 Nov 2009 14:00:40 -0800</pubDate>
      </item>
      <item>
         <title>Risky Business: It Doesn’t Have to Be That Way</title>
         <link>http://www.semanticweb.com/features/risky_business_it_doesnat_have_to_be_that_way_142758.asp?c=rss</link>
         <description>&lt;p&gt;&lt;strong&gt;&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.mediabistro.com/Jennifer-Zaino-profile.html&quot;&gt;Jennifer Zaino&lt;/a&gt;&lt;/strong&gt; &lt;br /&gt;
&lt;em&gt;SemanticWeb.com Contributor&lt;/em&gt;&lt;/p&gt; &lt;p&gt;&lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://bookofodds.com/&quot;&gt;The Book of Odds&lt;/a&gt; web site offers a forum for determining the odds of everyday life (see previous &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semanticweb.com/features/what_are_the_odds_this_semanticpowered_site_tells_you_140435.asp&quot;&gt;article&lt;/a&gt;), letting users combine odds statements in unexpected ways in real time to compare, for example, whether there’s a greater risk of dying from an encounter with a shark or a vending machine (it’s the latter). &lt;/p&gt; &lt;p&gt;&lt;img alt=&quot;cambridgesemanticslogo.jpg&quot; src=&quot;http://www.semanticweb.com/original/cambridgesemanticslogo.jpg&quot; width=&quot;133&quot; height=&quot;74&quot; align=&quot;right&quot; vspace=&quot;6&quot; hspace=&quot;3&quot;/&gt;It’s interesting that the site is powered by &lt;a rel=&quot;nofollow&quot; target=&quot;_blank&quot; href=&quot;http://www.semanticweb.com/features/making_data_silos_efficient_with_semantics_139074.asp&quot;&gt;Cambridge Semantics&lt;/a&gt; technology, because that vendor sees its business customers increasingly eager to take advantage of its tools for combining, using and sharing data from disparate sources - regardless of variations in data structure - to deal with their business risks. &lt;/p&gt; &lt;p&gt;Along with growing their toplines and controlling costs, risk is one of the most prominent issues Cambridge customers want to address, says CEO Michael Cataldo. &lt;/p&gt; &lt;p&gt;“And the biggest risk is associated with questions you just don’t know and so you can’t answer,” he says. 
“One of the reasons I think semantics is going to be so hot is that it puts the capability of answering those kinds of questions in the hands of the end users.”&lt;/p&gt; &lt;p class=&quot;continued&quot;&gt;&lt;a rel=&quot;nofollow&quot; class=&quot;continued&quot; target=&quot;_blank&quot; href=&quot;http://www.semanticweb.com/features/risky_business_it_doesnat_have_to_be_that_way_142758.asp#more&quot;&gt;continued...&lt;/a&gt;&lt;/p&gt;</description>
         <guid isPermaLink="false">http://www.semanticweb.com/features/risky_business_it_doesnat_have_to_be_that_way_142758.asp?c=rss</guid>
         <pubDate>Tue, 10 Nov 2009 10:34:57 -0800</pubDate>
         <category>Features</category>
         <enclosure length="2895" url="http://www.semanticweb.com/original/cambridgesemanticslogo.jpg" type="image/jpeg"/>
      </item>
   </channel>
</rss>
