Archive for the ‘The Developing Future’ Category

Why it’s hard to sell me on the Semantic Web - Part 2

January 29th, 2009

The first post in this series gave a background to the Semantic Web, as traditionally conceived.  This post gives an overview of three of the problems we’re facing in making that vision a reality.  I went longer on these three than I had thought I would.  I’ll delve into the fourth problem - trust - in the next post.

As for the title - let me clarify myself a bit.  I like the Semantic Web vision - it has poetry.  We’ll likely continue to see incrementally closer and closer implementations of it.  What I have a hard time swallowing are the claims (usually from software vendors) that they are delivering it today.  There are just too many tough problems between us and the goal to imagine that it’s all been solved by one software vendor.

If the vision of the Semantic Web is creating a distributed world-wide library of facts that your computer can use to answer all sorts of questions for you - what makes it so hard?   Let’s take a look at three of the major problems.

First, creating an ontology is hard. An ontology is an explicit, computer-readable declaration of what exists in the world and how all of those things are related.  If it’s to really include all of the myriad things that people care about, it’s a monstrously complex task.

The task of classifying… all the ideas that seek expression is the most stupendous of logical tasks.  Anybody but the most accomplished logician must break down in it utterly; and even for the strongest man, it is the severest possible tax on the logical equipment and faculty. - Charles Sanders Peirce

One way to tame this beast is to settle for an ontology that only covers the most popular items (e.g. food, travel, popular entertainment and consumer merchandise).  We’ll be able to ask our computers about mass-market things, but anything more unusual (Burmese culture, history of organized crime, vacuum repair) would be outside of the system’s depth.  It looks like Headup, among others, is tackling the problem from this angle.

The second problem - marking up semantic content is hard.  Beyond the very simple cases, creating documents that effectively tell computers interesting facts is a job for experts; it’s not at all as easy as HTML/CSS.  OWL, the language that the W3C has recommended for doing this work, is terrifically complex.  A person needs to breathe first-order logic in order to use it in any interesting way.  The general public is outclassed on this one.

Some less rigorous and less arduous ways to mark up content are showing up (e.g. Microformats).  These provide a simpler syntax for marking up very common items like places and people.   Some companies are also marking up major storehouses of information (like IMDB) by hand in order to provide the core information for the mass market audience.  In either case, the long tail of human knowledge is left out of the picture.

Even if we were to have a good model of what exists in the world and gobs of documents all marked up beautifully, we’d still have our third problem - the reasoning problem.  It is by no means simple to get a computer to do acts of logic in the wild.  Getting these reasoners rolling to the point where you can ask them a question and have them come back with an answer sometime before the heat-death of the universe is not a simple task.  Some questions are simply not answerable, but these are considered nice.  There are some questions that are not answerable in such a way that the computer will never know that they are not answerable - those are a bit nastier.  There are all sorts of people working on their doctorates on just small subsets of this problem.
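To make the reasoning problem a little more concrete, here is a minimal sketch of forward chaining over a handful of facts.  It is a toy, not how any particular OWL reasoner works: the fact format, the rule functions, and the max_rounds cut-off are all assumptions made for illustration.  The first run settles quickly; the second uses a rule that keeps producing new facts, so the only thing the program can do is give up after a fixed number of rounds.

```python
def forward_chain(facts, rules, max_rounds=100):
    """Apply every rule repeatedly until nothing new appears or we give up.

    Returns (known_facts, finished). finished is False when the round limit
    was hit, which is all this toy can say about a runaway rule set.
    """
    known = set(facts)
    for _ in range(max_rounds):
        new = set()
        for rule in rules:
            new |= rule(known) - known
        if not new:
            return known, True   # fixed point: nothing left to infer
        known |= new
    return known, False          # cut off: inference had not settled


def subclass_transitivity(known):
    """If A subClassOf B and B subClassOf C, infer A subClassOf C."""
    return {(a, "subClassOf", c)
            for (a, p1, b) in known if p1 == "subClassOf"
            for (b2, p2, c) in known if p2 == "subClassOf" and b2 == b}


def type_inheritance(known):
    """If x has type B and B subClassOf C, infer x has type C."""
    return {(x, "type", c)
            for (x, p1, b) in known if p1 == "type"
            for (b2, p2, c) in known if p2 == "subClassOf" and b2 == b}


def successor(known):
    """A deliberately runaway rule: every counter value implies the next one."""
    return {(s, "next", o + 1) for (s, p, o) in known if p == "next"}


# Hypothetical facts, invented for this example.
facts = {
    ("Deli", "subClassOf", "Restaurant"),
    ("Restaurant", "subClassOf", "Organization"),
    ("bobs", "type", "Deli"),
}

closed, finished = forward_chain(facts, [subclass_transitivity, type_inheritance])
print(finished, ("bobs", "type", "Organization") in closed)

runaway, runaway_finished = forward_chain({("counter", "next", 0)}, [successor], max_rounds=5)
print(runaway_finished, len(runaway))
```

The cut-off is the uncomfortable part: when the limit is hit, the program cannot tell whether one more round would have finished the job or whether the rules would have run forever.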

That’s three of the barriers - in the next post I’ll tackle the trust issue.

The Developing Future

Why it’s hard to sell me on the Semantic Web - Part 1

January 27th, 2009

A good friend of mine works as a social media editor.  We periodically get together for long lunches where the freewheeling conversation hits all the topics of note in the current communication scene.  I was surprised today when he brought up the question of the Semantic Web.  After a half-decade stint in the business of semantic technologies, I’ve basically written off the Semantic Web.  After ten years of failed promise, I’m always a bit surprised to hear another rumor of its pending existence.

In short - the Semantic Web promises to turn all of the text found on the web into machine readable facts, and to provide programs that can use those facts to answer questions for you.  So, for example, a restaurant website may say “We’re located at 518 Chestnut Street, have a wide variety of sandwiches, and are open on Saturday.”  The website may give a full menu, driving directions, a list of daily specials, etc.  To a computer this looks like just a bunch of text - blah blah blah blah.  A semantically marked up document would put a formal representation of this information in place along with the text.  Very loosely speaking, it would look something like this:

<Organization type="Restaurant" name="Bob’s Restaurant" id="1"/><isLocatedAt/><Address text="518 Chestnut Street"/>

<Organization id="1"/><sellsGoods/><Food type="sandwiches"/>

<Organization id="1"/><isOpen/><recurringDay value="7"/>

Once beautiful documents like this are in place, you can ask your computer a question like “Where can I get a sandwich on Saturday”, and the computer would come back with my restaurant.  You could even give your computer quite complex tasks and have it come back with good answers -  “I have to pick up toothpaste, a watermelon, and a large camelhair coat, meet with the mayor, my fiancee, and my lawyer, and I want to get a good sandwich around lunchtime.  Please plan out a course of travel and schedule that takes into account expected traffic and the hours of the shops I have to visit. Also, let me know if I’m passing any place that’s having a going-out-of-business sale.”  The computer would hit tens of websites, communicate with other agents, and put together the schedule and information for you.
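As a rough illustration of the simple sandwich question, the loose markup above can be flattened into subject/predicate/object triples and queried with a few lines of Python.  This is only a sketch: the predicate names, the second restaurant, and the convention that day 7 means Saturday are assumptions carried over from (or added to) the example, not any real Semantic Web vocabulary.

```python
# Hypothetical triples mirroring the loose markup above. "org:2" is invented
# so that the query has something to rule out.
TRIPLES = [
    ("org:1", "type", "Restaurant"),
    ("org:1", "name", "Bob's Restaurant"),
    ("org:1", "isLocatedAt", "518 Chestnut Street"),
    ("org:1", "sellsGoods", "sandwiches"),
    ("org:1", "isOpen", 7),   # assumed: 7 = Saturday
    ("org:2", "type", "Restaurant"),
    ("org:2", "name", "Closed Weekends Cafe"),
    ("org:2", "sellsGoods", "sandwiches"),
    ("org:2", "isOpen", 1),
]


def subjects_with(predicate, obj):
    """Return the set of subjects that carry the given predicate/object pair."""
    return {s for (s, p, o) in TRIPLES if p == predicate and o == obj}


def value_of(subject, predicate):
    """Return the first object stored for subject/predicate, or None."""
    for s, p, o in TRIPLES:
        if s == subject and p == predicate:
            return o
    return None


def where_can_i_get(goods, day):
    """Answer 'where can I get <goods> on <day>' as (name, address) pairs."""
    matches = subjects_with("sellsGoods", goods) & subjects_with("isOpen", day)
    return sorted((value_of(s, "name"), value_of(s, "isLocatedAt")) for s in matches)


answers = where_can_i_get("sandwiches", 7)
print(answers)
```

The query itself is the easy part.  It only works because both the data and the question already agree on what "sellsGoods" and day 7 mean.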

That’s the dream.  None other than Tim Berners-Lee, the father of the web, has been championing this for years.  The seminal article on the topic was published in 2001.

There are a few major roadblocks.  Teaching computers about common sense is hard - that’s the ontology problem.  Creating those beautiful documents above is hard - that’s the markup problem.  Teaching computers to reason through all those facts is hard - that’s the reasoning problem.  The one I’d like to really focus on, though, is the trust problem.  I’ll post on that one in the coming days.

Social Media, The Developing Future

Change Afoot

January 6th, 2009

For the next 10 days or so, the second round of voting is happening at change.org. 90 issues passed the first round, and the top ten vote getters will be presented to the Obama administration on January 16th.

One that gets my attention, but hasn’t yet worked its way to the top ten, is Lawrence Lessig’s proposal for publicly funded elections. See his presentation and vote here. It may well be the best idea I’ve ever seen in American politics. Moreover, without this or a similar measure, it’s easy to see America suffering greatly as corruption eats away at the heart of its political system.

The 7 minute presentation is well worth watching, and is, I believe, a cause for hope.

Politics Unusual, The Developing Future

This is your Brain on New Media

December 18th, 2008

There’s been a firestorm of late about the number of repetitive stories on RSS, particularly in the technical blogs. Michael Arrington declared open war on embargoes, which touched off an insightful article from Louis Gray. (Thanks to this article from Smoothspan for sending me over.)

Louis writes:

While I look forward to banging through my Google Reader feeds every day, I can pretty much bank on seeing the same story, spun a different way, a good dozen or two dozen times by every single tech blog - even if it’s clear that they are just reporting that someone else reported the news. If you see a story has been covered already and you have nothing to add - leave it alone.

What is most interesting to me here is the personal and societal. We’re the guinea pigs in a new media reality. I would really love to hear a voice as incisive as Marshall McLuhan’s to help me understand what that is doing to my brain. We have here a medium that can be treated either as hot or as cold. It is neither entirely overwhelming nor intensively participatory. Neither is it somewhere in between - it’s something other than the media we’ve seen up until now. Its character is entirely dependent on the reader.

This medium calls to the forefront each person’s ability to choose, and it’s likely for this reason that it’s becoming the arena for a brilliant hashing out of interpersonal ethics - When do I speak and when am I silent? What obligations do I have to the people who listen to me? What obligations do I have to myself when I participate in this? How much responsibility do I bear for the overall state of the media?

Still cooking these ideas…any insight welcome.

Social Media, The Developing Future

How Powerful are the People?

November 22nd, 2008

Lawrence Lessig just won me as a new fan. I feel like I can breathe better after listening to this interview (below).
(For those of you reading via syndication, click through to the original post to see the video.)

Topics include Professor Lessig’s relationship with Obama, national emergencies, transitional government, trust, the virtues of amateur creativity, hybrid economies, copyright (the entrenched policy, the dangerous reaction, and a more reasonable reform), remix as fair use, Creative Commons, his shift into focusing on corruption as the core underlying problem, the influence of money on politics, how to break the political dependency on money, and getting Congress to put their reform chips on the table.

Favorite Quote:
“These are not the hard things that congress are getting wrong; these are the easy things that congress is getting wrong.”

Update: Here’s the powerful presentation on changing Congress that he refers to in the video.

Politics Unusual, Smart Folks, The Developing Future

Better Place Rolling out Electric Car Network in California

November 22nd, 2008

First Israel, then Denmark, some cities in Australia, and now the Bay Area.  Full details aren’t out yet, but this has to be Better Place’s highest-profile coup so far.  More power to ‘em.

http://venturebeat.com/2008/11/20/california-to-set-up-a-1b-electric-car-network/

The Developing Future

SimpleDB. Simple? well…. DB? umm…

December 16th, 2007

Amazon has announced its next software-as-a-service play - SimpleDB - and the technical world is all a-flutter. You can see the breathless reporting all over the net. Is this the beautiful panacea of unleashed database power it’s being reported as?

A few months back, I examined S3 and EC2 (two of Amazon’s earlier web service offerings) and came away with the sense that Amazon is changing the rules of the game in a big way, but that there is still no good way to implement an online scalable database with these services. (I know that there are attempts to put a relational database on EC2, but they seem to be quite painful.)

In short, S3 is a great scalable online file system, and EC2 is a great scalable processor. Neither one of them allows us the sort of slicing, dicing, remixing, and re-serving of data that the world has come to expect from a database.

So when the news of SimpleDB hit the wires, I figured that Amazon was stepping up to answer my cry and providing an online scalable database. Well, they are, and they aren’t.

They are providing a way to store structured information online, but it’s hard to call it a database. In fact, it’s a bit disingenuous of them to do so.

Those who are concerned about the details can quickly find out for themselves that SimpleDB has nothing to do with the Relational Model that has been the basis for databases for the past 40 years. To call something that doesn’t even smell like the relational model a database is pure marketing.

But let’s leave the marketing aside - the service is in the field, it’s a totally new beast, and it’s called what it’s called. What does it look like? It looks like a place to store object instances. No classes, no schema, just they-are-what-they-look-like object instances. Besides coming with a whole new metamodel, it comes without a lot of the sugar that mature database systems have led us to expect. There’s no full-text search, queries are lexicographic (so they don’t deal well with numbers or dates), the set of operators on a query is more limited than we’re used to, there’s no verification of the data, no triggers, etc.
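The lexicographic point is easy to see with plain Python strings, which compare the same way.  The sketch below does not touch SimpleDB at all; the sample values and the pad helper are made up for illustration.  Zero-padding to a fixed width is a common workaround for non-negative integers (negative numbers need more than this), and ISO-style dates happen to sort correctly as text where month/day/year strings do not.

```python
# Numbers stored as text sort character by character, not by value.
prices = ["9", "10", "100", "25"]
as_text = sorted(prices)


def pad(n, width=6):
    """Zero-pad a non-negative integer so text order matches numeric order."""
    return str(n).zfill(width)


padded = sorted(pad(int(p)) for p in prices)

# A "greater than 10" filter done as a string comparison, before and after padding.
wrong = [p for p in prices if p > "10"]
right = [int(p) for p in padded if p > pad(10)]

# Dates: ISO 8601 strings order correctly as text, month/day/year strings do not.
iso_ok = "2007-12-16" > "2007-12-05"   # later date compares greater, as expected
us_bad = "12/5/2007" > "1/6/2009"      # an earlier date compares greater

print(as_text)
print(padded)
print(wrong, right)
print(iso_ok, us_bad)
```

In other words, the application has to encode its values so that string order matches the order it actually wants, which is the sort of chore a typed database column normally does for you.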

All this is fun, but the real kicker is this - reading data from SimpleDB immediately after a write may not reflect the latest updates. This is called eventual consistency. That’s what you tell your customers - it’ll get there eventually.
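Here is a toy model of what that looks like from the developer’s chair.  It is not SimpleDB’s API or its actual replication scheme: the ToyEventualStore class, its replicas, and the propagate step are all invented for illustration.  A write lands on one copy of the data, a read served from another copy misses it, and a retry loop papers over the gap.

```python
class ToyEventualStore:
    """Toy model: a write lands on replica 0 and reaches the rest on propagate()."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # writes not yet copied to the other replicas

    def put(self, key, value):
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def get(self, key, replica):
        """Read from one specific replica; it may not have seen recent writes."""
        return self.replicas[replica].get(key)

    def propagate(self):
        """Copy pending writes to every other replica (stands in for time passing)."""
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()


def read_with_retry(store, key, expected, replica, max_attempts=3):
    """Poll until the expected value is visible; return the attempt count or None."""
    for attempt in range(1, max_attempts + 1):
        if store.get(key, replica) == expected:
            return attempt
        store.propagate()  # in real life: wait a moment, then try again
    return None


store = ToyEventualStore()
store.put("order:1", "paid")
stale = store.get("order:1", replica=2)    # read right after the write
attempts = read_with_retry(store, "order:1", "paid", replica=2)
fresh = store.get("order:1", replica=2)
print(stale, attempts, fresh)
```

The retry loop is exactly the kind of code that ends up in the application instead of the database.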

What happens now? What happens now is that the developers start to relearn the way they handle data. No existing database applications - back-end, front-end, or middleware - can be easily ported to run on top of this new beast. You can’t tell your customers that the data will get there eventually, so you tell your developers to cover the gap. This service might save you a database administrator, but in the near term, you’ll need another developer to take his place.

In the long term, we’ll start seeing SimpleDB, S3, and EC2 aggregated under another layer - one that presents the tried and true relational model. SimpleDB will handle the tuples, S3 will handle the BLOBS, EC2 will grind the queries, and the application developer won’t have to worry about it. Whether Amazon delivers it or someone else does, it’s coming - the reliable online scalable database.

(Some of the sharper analyses: O’Reilly compares pricing of SimpleDB to S3, Marcelo smells a familiar data model, Inside Looking Out lays out some of the technical hurdles, rc3 wonders how to tune it, and the comment from daveadams sings a love song to the relational model)

The Developing Future

Of Content, we have Plenty

December 5th, 2007

Nicholas Carr argues that a key factor in Kindle’s downfall is the lack of an already living and healthy market for free reading material. I share his misgivings about the Kindle, but I don’t agree with his reasoning. Although new books are generally locked up in copyright, there’s no lack of free reading material. That’s what we (as humans) have been busy developing for the past 15 years or so. It’s called the Internet.

And that’s really where Amazon is missing the ball. By locking down their device to access a tiny percentage of the potential content, and making it difficult to get the free content on and off, they’ve painted themselves into a proprietary corner. What they could have done was put a lightweight web browser into our hands - now that would have been fun.

The Developing Future