Structured Blogging Progress and Loop-Inning
[Marc Canter and Bob Wyman sent an email out bringing me into the loop with a number of great folks doing work on the structured blogging concept. Very excited about this. Here's my response to them. A little wordy, but I wanted to cover the bases because I'm not sure anybody knows what it is we do here at Reger.com. Hoping to see many of these folks at the Web 2.0 conference.]
Hi All!
Thanks for bringing me into the loop! Let me share a little about where I'm headed at Reger.com... I think it's right in line with what you guys are doing. I need help making sure that what I'm doing is standards compliant and open... and even makes sense.
With our geeky wares, users can create their own "Log Types" via a user interface. They add fields of various types (dropdown, numeric range, text box, radio buttons, etc.), name them and drag them into arrangement on the screen. With their "Log Type" defined, they begin to blog, collecting data. For users, things are kept non-XML and non-geeky... they just add fields for things they need to track.
By virtue of having defined a "Log Type", we dynamically generate and publish the underlying XML Schema. An example of such an XML Schema is here:
http://reger.com/about/signup-log-type-detail.log?eventtypeid=1&logtypedetailtoshow=6
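For readers who want the idea in code, here is a minimal sketch of turning a flat list of user-defined fields into an XML Schema. It is an illustration only, not the code running at Reger.com: the field names, the `FIELD_TYPES` mapping and the `log_type_to_schema` function are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical mapping from UI field types to XML Schema built-in types.
FIELD_TYPES = {
    "text": "xs:string",
    "numeric": "xs:decimal",
    "dropdown": "xs:string",
}


def log_type_to_schema(name, fields):
    """Build a flat XML Schema from (field_name, field_type) pairs."""
    schema = ET.Element(f"{{{XS}}}schema")
    root = ET.SubElement(schema, f"{{{XS}}}element", name=name)
    complex_type = ET.SubElement(root, f"{{{XS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    for field_name, field_type in fields:
        # Each user-defined field becomes one child element of the entry.
        ET.SubElement(
            sequence,
            f"{{{XS}}}element",
            name=field_name,
            type=FIELD_TYPES[field_type],
        )
    return ET.tostring(schema, encoding="unicode")


# Assumed example fields for a running log.
xsd = log_type_to_schema(
    "RunningLog",
    [("distance", "numeric"), ("shoes", "text"), ("surface", "dropdown")],
)
print(xsd)
```

The sketch only handles tabular, flat data: one element per field, no nesting and no constraints.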
When users blog with their special "Log Types" we publish that data in a couple ways. First, we add it to a custom namespace in the RSS feed. This is a fairly simple name/value pair implementation that we came up with. It works but it isn't very extensible. We need help here.
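As a rough illustration of the name/value pair idea, the sketch below attaches fields to an RSS item under a custom namespace and reads them back. The namespace URI, the `log` prefix, the `field` element and the field names are all hypothetical; they are not the ones used in our feeds.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and prefix, for illustration only.
LOG_NS = "http://example.com/ns/logdata"
ET.register_namespace("log", LOG_NS)


def add_log_fields(item, fields):
    """Append one namespaced name/value element per field to an RSS item."""
    for name, value in fields.items():
        field = ET.SubElement(item, f"{{{LOG_NS}}}field", name=name)
        field.text = str(value)
    return item


def read_log_fields(item):
    """Read the name/value pairs back out, as an aggregator might."""
    return {f.get("name"): f.text for f in item.findall(f"{{{LOG_NS}}}field")}


item = ET.Element("item")
ET.SubElement(item, "title").text = "Morning run"
add_log_fields(
    item,
    {"distance": 20, "shoes": "Adidas Supernova", "surface": "Pavement"},
)
print(ET.tostring(item, encoding="unicode"))
```

This also shows the weakness: every value comes back as an untyped string, and nothing in the feed says which Log Type the pairs belong to.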
Second, we publish the data inside of the HTML à la Structured Blogging. View source on this page:
http://joereger.com/entry-logid2-eventid4470.log
You'll see this:
20
Adidas Supernova
0
0
Pavement
10100
Nothing major here, but it is SB and it does shield the user from having to do any code work, allowing them to create ad hoc "Log Types" as they go at the UI level.
Users can build charts and graphs with whatever data they create. They can create saved searches based on data fields. For example, "show me runs where I went over 10 miles." Or, "show me movies that were rated R and I gave two thumbs up to."
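A saved search of this kind can be thought of as a list of field conditions applied to entries. The following is a minimal sketch under that assumption; the `Condition` class, the operator table and the field names are invented for the example and do not describe how our searches are actually stored.

```python
import operator
from dataclasses import dataclass

# Comparison operators a saved search may use.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


@dataclass(frozen=True)
class Condition:
    field: str
    op: str
    value: object


def run_saved_search(entries, conditions):
    """Return entries that have every named field and satisfy every condition."""
    def matches(entry):
        return all(
            c.field in entry and OPERATORS[c.op](entry[c.field], c.value)
            for c in conditions
        )
    return [entry for entry in entries if matches(entry)]


# Assumed sample data: entries are plain dicts keyed by field name.
runs = [
    {"title": "Long run", "distance_miles": 20},
    {"title": "Recovery jog", "distance_miles": 3},
]
over_ten = run_saved_search(runs, [Condition("distance_miles", ">", 10)])

movies = [
    {"title": "Movie A", "rating": "R", "thumbs_up": 2},
    {"title": "Movie B", "rating": "PG", "thumbs_up": 2},
]
r_two_thumbs = run_saved_search(
    movies, [Condition("rating", "==", "R"), Condition("thumbs_up", "==", 2)]
)
```

Entries that lack a searched field are simply skipped, which matters once many different Log Types sit in the same blog.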
(Sorry for the soap box advertising... I'm not sure how familiar everybody is with what we do at Reger.com.)
So, where from here?
Our next step is to allow users to upload XML Schema files to create "Log Types." When we do this we'll have much more portability. This is where I'd like to get involved with the group. In my mind users will create "Log Types" and email them to all of their friends who import them into whatever blogging tool they're using. They then start blogging, collecting the same data as their friends.
(And, as an added bonus, when we have this I can go back to Jon Udell and say "see, now we've got that." He threw down the challenge earlier in the year.)
In my geeky dreams every person, club, group, family and graduating high school class has their own Log Type. Granular, focused formats that allow people to track what they want to track quickly and easily. Initially this largely means tabular, flat data, but can later be expanded.
You'll likely challenge that it's hard to provide value on top of thousands of formats. What's the point in having all these formats, but no standardization? It's a good question, but one that I'm willing to defer in favor of giving people the freedom to create and track their data as they want to. Bob Wyman's a bright guy and he'll find threads between Log Types :) I don't want to dismiss the question... I don't know what the answer is, but I know that we're doing a hell of a lot on the web with nothing but text strings... any sort of structure will improve things eventually. I'm interested in this vision of a fluid array of formats for a number of reasons.
First, it's democratic and grassroots. Standards can grow organically and gain acceptance without massive investment, argument or organization. This is done by voting mechanisms, usage metrics, etc... each new user who wants to track their Running data will have twenty Running Log Types to choose from and will choose the one that fits them. On the site now you can see who's using each Log Type, how many entries there are using it, etc. Multiply by a million users and there will be a winner... or two... but not likely twenty. If somebody like Bob Wyman only has the budget to support one Running Log format then, well, he can choose one. And by virtue of choosing one he'll give it momentum and power. Then Google can choose another format and they can war it out for a while. As a toolmaking facilitator, I'm somewhat agnostic. An example of usage data for a Log Type is on the left side of this page:
http://reger.com/about/signup-log-type-detail.log?eventtypeid=1&logtypedetailtoshow=1
Second, the demands of consumers with respect to the back-end XML integration aren't as high. I want the initial bar to be low. They aren't interested in enterprise-quality integration... they won't be spending millions to integrate their legacy Racquetball Log into their Racquetball Log 2.0. This means that I/we can work with them to learn how to morph standards across blogging systems, share schemas, etc. Once we figure out the basics we can move on to more enterprise standards which demand a lot of flexibility. Baby steps. Lessons learned. There are a lot of tricky issues, both technical and religious, to deal with here.
Ok, so I've shared what we're thinking and hopefully haven't scared you away with this talk of thousands of microformats, Log Types, XML Schemas, etc.
There are a few questions I need help on:
1) Is XML Schema the best way to represent what we call a Log Type (and others call a SB entry, microcontent, etc.)? I think XML Schema is an incredibly powerful and robust way to describe data. It seems to be leading the space and is applicable. You mention some other formats.
2) Do we need a Base Plus approach? By this I mean, do we need to agree to a base set of fields that define an SB entity and then add extended data fields to it? RSS and/or Atom seem to have solved the base part already and may be a good starting point. The reason this is important is on the aggregator level. Bob Wyman is gonna need some basics like a title, datestamp and GUID to display search results. Things he can rely on. If we don't start with some of the basics we may start out trying to support too many different "objects" embedded into a page. Marc might say that the "media" type is a base... you have a base set of fields for an image... a base set of fields for a blog entry... a base set of fields for a review. But from there, in my vision, users run rampant and create whatever tickles their fancy. Aggregators like Bob Wyman get a balance of fixed fields and new, exciting user data.
3) Assuming XML Schema is a good way to go, to what level do we need to support it? How many levels of nested attributes? Which data types? Which constraints? The XML Schema spec is massive and from a toolmaker perspective (my perspective) it could take years to implement it completely. I haven't seen anybody completely implement XML Schema into a UI. Some Cocoon developers are close, but still not complete. I agree that we'd like to get full XML Schema compliance, but in the meantime I think we need to start with something a little more constrained. This is one place that I'll begin to make recommendations as we launch the ability to upload XML Schema files to create log types. You guys will download CDWA, upload it and get something crazy. So you'll tell me I need to support more or less or different.
4) Marc mentioned a compiler that takes "pseudo-code" Python stuff and turns it into a number of schema formats. What's the source of the "pseudo-code" Python stuff? Where does the format begin?
5) When you say plug-ins for WP, MT and Drupal, what will users be able to blog? In my terminology, what Log Types will these plug-ins create? Or will they allow users to define their own Log Types? If so, can I get a head start on the sharing of Log Types with those systems?
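To make the Base Plus idea from question 2 a little more concrete, here is a minimal sketch: a fixed set of base fields that an aggregator can always rely on, plus a free-form bag of Log Type fields. The `Entry` class, its field names and the sample values are assumptions for illustration, not an agreed format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Entry:
    # Base: fields every entry carries, whatever its Log Type.
    title: str
    timestamp: datetime
    guid: str
    # Plus: the Log Type name and whatever fields the user defined for it.
    log_type: str = ""
    extended: dict = field(default_factory=dict)


def search_result_line(entry):
    """What an aggregator can render using only the base fields."""
    return f"{entry.timestamp:%Y-%m-%d} {entry.title} [{entry.guid}]"


# Assumed sample entry; the GUID and field names are made up.
entry = Entry(
    title="Morning run",
    timestamp=datetime(2005, 9, 1, tzinfo=timezone.utc),
    guid="urn:example:entry:1",
    log_type="Running",
    extended={"distance": 20, "shoes": "Adidas Supernova", "surface": "Pavement"},
)
print(search_result_line(entry))
```

An aggregator that has never seen the "Running" Log Type can still list the entry, and one that has can dig into `extended`.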
Thanks again for getting me involved. I'm excited to play a role but honestly I'm at a bit of a loss as to how to do so. Please don't hesitate to point me in the right direction, tell me what you'd like to see, etc.
And I promise I won't be as wordy next time.
Best,
Joe Reger
P.S. Posting to the blog as well...