XSD for Atom
August 7, 2008 on 8:30 pm | In Atom | No Comments
Another question I’ve fielded several times over the last few weeks: is there an XSD for Atom?
My first response: “What are you using the schema for?” I won’t go into all of the details of that dialog, but remember that you should always carefully consider the use of a schema before building one.
So to the point…
The Atom format spec (RFC 4287) includes a RELAX NG schema, but note that the schema is informative rather than normative: the text of the spec is what defines conformance. Validating XML against this schema is therefore not enough, as RELAX NG is not expressive enough to capture some of the other validity requirements for Atom feeds or entries, namely the requirements listed in sections 4.1.1 and 4.1.2, respectively. XML Schema is not expressive enough either. In fact, I don’t believe any schema language is, and that is why we have the feed validator (also available here - it’s open source). That being said, a schema can be used for more than validation, and quite a lot of tooling (e.g. web services tooling) leverages XSD. Hence the question.
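To make the "not expressive enough" point concrete, here is a minimal sketch of one of the section 4.1.1 rules written as ordinary code: a feed must carry at least one atom:author unless every one of its entries carries its own. The function name and the sample feeds are my own illustrations, and this covers only that single rule, not everything the feed validator checks.

```python
import xml.etree.ElementTree as ET

# The Atom 1.0 namespace, in ElementTree's {uri}localname form.
ATOM = "{http://www.w3.org/2005/Atom}"


def feed_author_rule_ok(feed_xml: str) -> bool:
    """Check the feed-level author rule from RFC 4287 section 4.1.1.

    Hypothetical helper: returns True if the feed has an atom:author,
    or if every atom:entry in the feed has one of its own.
    """
    feed = ET.fromstring(feed_xml)
    if feed.find(ATOM + "author") is not None:
        return True
    entries = feed.findall(ATOM + "entry")
    # With no feed-level author, each entry must name an author itself.
    return all(entry.find(ATOM + "author") is not None for entry in entries)


# Illustrative documents (not complete, valid feeds).
GOOD_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><author><name>A</name></author></entry>
  <entry><author><name>B</name></author></entry>
</feed>"""

BAD_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><author><name>A</name></author></entry>
  <entry><title>No author anywhere</title></entry>
</feed>"""

print(feed_author_rule_ok(GOOD_FEED), feed_author_rule_ok(BAD_FEED))
```

The condition depends on what appears elsewhere in the document, which is exactly the kind of constraint that is awkward to state in a grammar-based schema.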
I did a bit of digging and found what I think is the most recent (June) Atom xsd thread from the mailing lists. In particular this thread showed up in the Abdera user mailing list - Abdera is an opensource implementation of both RFC 4287 and RFC 5023 as well as several extensions, both approved and internet drafts. It is currently in incubation with Apache.
In this particular thread Danny Ayers suggested Trang for conversion from the Relax-NG schema to xsd. I gave it a shot and while I haven’t used the resulting xsd for anything I did find that another internal project here at EMC using Atom did the same and are using that xsd.
There doesn’t seem to be a de facto standard way of obtaining an XSD for the Atom format - this mechanism seems as good as any.
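For anyone who wants to try the Trang route, the conversion might be scripted roughly as follows. This is only a sketch: the jar location and file names are assumptions (I am assuming the RELAX NG Compact schema from the spec has been saved as atom.rnc), and the helper names are made up for illustration. The -I and -O options select Trang's input and output formats.

```python
import subprocess


def trang_command(jar_path: str, rnc_in: str, xsd_out: str) -> list:
    """Build the command line for a RELAX NG Compact to XSD conversion.

    Hypothetical helper; jar_path, rnc_in and xsd_out are whatever
    paths apply on your machine.
    """
    return ["java", "-jar", jar_path, "-I", "rnc", "-O", "xsd", rnc_in, xsd_out]


def convert(jar_path: str, rnc_in: str, xsd_out: str) -> None:
    """Run Trang; raises CalledProcessError if the conversion fails."""
    subprocess.run(trang_command(jar_path, rnc_in, xsd_out), check=True)


if __name__ == "__main__":
    # Assumed file names, shown rather than executed.
    print(" ".join(trang_command("trang.jar", "atom.rnc", "atom.xsd")))
```

As with the schema itself, the generated XSD describes the syntax only; the extra validity rules still need a separate check.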
Atompub collections and resource collections
July 11, 2008 on 2:08 pm | In ResourceOrientation, Atom | No Comments
There is a subtle but important difference in the use of the term “collection” in the context of resource oriented architectures and Atom standards. I’ve literally had this conversation 3 or 4 times with colleagues in the last week or two so thought it relevant to post.
At the crux of the matter is the fact that the term “collection” is used in both the Atom context and resource oriented context. The good news is that the thing each is referring to with the term “collection” is almost the same as what is referred to in the other - the bad news is that the things they are referring to are almost the same. It is the “almost” that I want to address here. Let me start out with the good news “almost.”
In resource oriented interfaces, i.e. REST interfaces, the term collection is colloquially used to refer to sets of resources. Examples of resource collections are all of my blog entries, all of my photos on Flickr, all of my snowboarding photos on Flickr or all of the blog entries from some source that are about Atom. Pretty simple concept. The Atom Syndication Format has a construct, a feed, that can be used to represent a resource collection. The Atom format doesn’t formally define a collection (it formally defines a feed), nor does any resource-oriented literature that I am aware of.
The term “collection” IS formally defined in Atompub.
“The Atom Protocol defines Collection Resources for managing and organizing both kinds of Member Resource. A Collection is represented by an Atom Feed Document. A Collection Feed’s Entries contain the IRIs of, and metadata about, the Collection’s Member Resources. A Collection Feed can contain any number of Entries, which might represent all the Members of the Collection, or an ordered subset of them.”
Okay, so far, so good - seems not to have any conflict with the less formal notion of a collection in resource-oriented architectures. But later on in the spec is where the difference comes in. Atompub defines a new element, app:edited, which is the date that the entry was most recently changed, and it is the semantics around this that introduce a subtle difference between the resource collection and an Atompub collection.
“The Entries in the returned Atom Feed SHOULD be ordered by their ‘app:edited’ property, with the most recently edited Entries coming first in the document order.”
“SHOULD” is defined in RFC 2119 and should be read as “your compliant app really, really, really should abide by this but if it doesn’t, you had better have a really, really, really good reason for any deviation.” What that means is that Atompub collections are to be ordered by the app:edited value. I know, I know, there are plenty of instances where we want a feed that is ordered by something other than a last edited date - contacts or even documents alphabetically, calendar entries by the appointment date/time (not by when the appointment was created or updated), etc. Sets of entries with such orderings can be safely represented as Atom feeds but they are NOT Atompub collections.
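A quick way to see which side of the line a given feed falls on is to check the ordering directly. The sketch below is illustrative only: the function names and sample feeds are invented, it treats a missing app:edited as "not ordered", and it assumes simple RFC 3339 timestamps without fractional seconds.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM = "{http://www.w3.org/2005/Atom}"
APP = "{http://www.w3.org/2007/app}"  # Atompub namespace (RFC 5023)


def parse_timestamp(text: str) -> datetime:
    """Parse a simple RFC 3339 timestamp; 'Z' is rewritten as +00:00."""
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def ordered_by_edited(feed_xml: str) -> bool:
    """True if entries appear most recently edited first."""
    feed = ET.fromstring(feed_xml)
    times = []
    for entry in feed.findall(ATOM + "entry"):
        edited = entry.findtext(APP + "edited")
        if edited is None:
            return False  # assumption: no app:edited means we cannot confirm order
        times.append(parse_timestamp(edited))
    # Each entry must be at least as recent as the one that follows it.
    return all(a >= b for a, b in zip(times, times[1:]))


def sample_feed(*stamps: str) -> str:
    """Build a tiny illustrative feed with one entry per timestamp."""
    entries = "".join(
        "<entry><app:edited>%s</app:edited></entry>" % s for s in stamps
    )
    return (
        '<feed xmlns="http://www.w3.org/2005/Atom" '
        'xmlns:app="http://www.w3.org/2007/app">%s</feed>' % entries
    )


NEWEST_FIRST = sample_feed("2008-07-11T14:08:00Z", "2008-07-01T09:00:00Z")
ALPHABETICAL = sample_feed("2008-07-01T09:00:00Z", "2008-07-11T14:08:00Z")

print(ordered_by_edited(NEWEST_FIRST), ordered_by_edited(ALPHABETICAL))
```

A feed like ALPHABETICAL is still a perfectly good Atom feed; it just should not be advertised as an Atompub collection.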
Why did Atompub add this constraint? Because Atompub is not only about the publishing or writing of content, it is equally about synchronization. Off line access. Take a feed reader for example. Many of them have off-line support so that I can catch up on the latest even when I am on an airplane. If the server producing the feed is Atompub compliant and you have subscribed to feeds that are in fact collections, whenever you do come on line the reader can request a refresh on a feed and be sure to get the latest changes without having to retrieve the entire feed.
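Here is a rough sketch of why the ordering matters to a synchronizing client. Because the most recently edited entries come first, the client can walk the feed in document order and stop at the first entry it has already seen. The function name, the sample feed, and the idea of keeping a single "last sync" timestamp are my assumptions for illustration; a real client would also need to follow paged feeds, which this does not attempt.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM = "{http://www.w3.org/2005/Atom}"
APP = "{http://www.w3.org/2007/app}"


def changed_since(feed_xml: str, last_sync: datetime) -> list:
    """Return ids of entries edited after last_sync.

    Relies on the collection being ordered by app:edited, newest first:
    as soon as an older entry is reached, everything after it is older too.
    """
    feed = ET.fromstring(feed_xml)
    changed = []
    for entry in feed.findall(ATOM + "entry"):
        stamp = entry.findtext(APP + "edited").strip().replace("Z", "+00:00")
        if datetime.fromisoformat(stamp) <= last_sync:
            break  # the rest of the feed was already seen
        changed.append(entry.findtext(ATOM + "id"))
    return changed


# Illustrative collection feed with made-up entry ids.
COLLECTION = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:app="http://www.w3.org/2007/app">
  <entry><id>urn:example:3</id><app:edited>2008-07-11T14:08:00Z</app:edited></entry>
  <entry><id>urn:example:1</id><app:edited>2008-07-09T08:00:00Z</app:edited></entry>
  <entry><id>urn:example:2</id><app:edited>2008-07-02T10:30:00Z</app:edited></entry>
</feed>"""

last_sync = datetime(2008, 7, 5, tzinfo=timezone.utc)
print(changed_since(COLLECTION, last_sync))
```

If the server orders the feed some other way, the early exit silently drops changes, which is the breakage described in the following paragraphs.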
Often when we have overloaded terms, like “collection”, the context of use makes it very clear which meaning is intended. In the case of resource collections and Atompub collections, however, the context doesn’t always help. One reason for that is that it is not at all uncommon for people to refer to feeds as “Atom collections”, even when they are not doing anything with Atompub. While this can lead to confusion it is technically still correct. But when someone starts talking about “Atom collections” when what they really mean are “Atompub collections”, but they are not properly ordering the entries in the collection feeds, that is when things can break.
Simply put, all Atompub collections are feeds, but not all feeds and not all resource collections can be said to be Atompub collections. I personally use the term “Atom feed” when I am referring to an Atom format representation of a resource collection and I use the term “Atompub collection” when I am referring to a true Atompub collection.
The new consumer of Atom/RSS feeds
June 20, 2008 on 3:35 pm | In Atom, web20 | No Comments
Just this week I was talking with some colleagues about Atom and its applicability to representing content types other than blog entries and news articles and at the same time a query came over the atom syntax mailing list asking about feeds of non-blog content for use in mashups. Aside from mashups being one of the sexiest application development paradigms out there today, I’m happy to have them on the scene because they represent a class of use completely distinct from the first consumers of feeds. The original consumer of feeds was the human. Sure, there was a program that humans used to help parse the feed (the feed reader) but that parsing was generally the only automation that sat between the feed and the human. So, of course, the type of information in that feed was directly targeted at the human.
- GData: a feed including items found with the search term rest web service. (gdata API info)
- Flickr: an atom feed of publicly available photos - you would typically constrain the feed with a targeted query (flickr services info)
And moving a bit further into the realm of application consumers of feeds, OpenDS offers an Atom feed over the contents in the directory server.
Momentum is definitely building.
WOW - Photojournalism at its best
April 16, 2008 on 5:09 pm | In Misc, Uncategorized | 1 Comment
I got there via a rather convoluted path (that I’ll spare you), and I cannot believe that I hadn’t seen this before, but Chris Jordan is truly a talent and one with a strong social conscience to boot. Wow. His current work, called Running the Numbers: an American Self-Portrait (not yet complete), produces art not from a single photograph (though he does that brilliantly as well - see his work on Katrina), but by building a collage. What is unique about his work is that the number of objects he is depicting in an image, sometimes in the millions, is tied to some statistic - e.g. the number of paper grocery bags used in the US every hour (1.14 million). Some of the pictures are fairly uniform, depicting sheer volume with little form; in others he creates form from arrangements of said objects, often with a degree of irony (e.g. paper bags depicting trees).
While the pictures really do tell a thousand words (or sometimes, as Mark Twain would espouse, far fewer), I found this review of his work, with a few extra words from him quite interesting - in particular, view the interactive feature.
Have a look - you won’t be disappointed!
The JSON and XML debate and what worries me most about JSON…
September 12, 2007 on 7:53 pm | In JSON, XML | No Comments
…my concern is not one of features of JSON - please read on.
Okay, so I’ve spent a fair part of the afternoon today surfing through dozens of articles that address JSON vs. XML. And like any other “vs.” debate, these articles express capabilities of each and there are obviously “things” that one addresses differently than the other. I completely agree with what many have said, that it isn’t a matter of one being better than the other, rather they serve potentially different needs. The choice then on which to use when you are designing a system is dependent on what the usage scenarios are. Specifically in the JSON vs. XML discussion, how is the response message (that will come in JSON or XML or perhaps some other format) to be processed by the client?
There are a few people in the blogosphere who have commented to this effect, pointing out for example that the XML toolset allows for processing beyond deserialization, but I’m afraid that this point is getting lost in the noise. Mind you, I loved Dare Obasanjo’s Updated: XML Has Too Many Architecture Astronauts and while I confess to going too high in the atmosphere at times, I hear the point loud and clear that we cannot argue away something as popular as JSON with an architectural discussion - it’s popular because it enables something that users want (and no, I am not implying that JSON is fundamentally flawed and needs to be argued away). Absolutely right - JSON is seeing tremendous uptake because it makes things easy for a lot of developers. But that does not change the fact that there might be things we want to do with a received message beyond what JSON can support.
I don’t think that there is any argument that JSON, with supporting tools in Javascript and other languages, makes it really easy to take a serialized data structure and parse it, loading the data into a data structure accessible by the client application. What scares me is precisely the fact that this is what a large percentage of the population most often or even always wants to do. JSON is so popular because it allows developers to stay in their comfort zone.* Again, before you throw flames, I don’t mean to imply that the preferred programming paradigm, which remains largely procedural, is always bad, but I am saying that in certain cases there is something better. (*As an aside, this also scares me in the world of SOAP-based Web Services - WSDL is used to generate client side structures and XML is just used to transport the data over to the client structure.)
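To illustrate the contrast I am drawing, here is a small sketch with the same made-up message in both formats (the photo data and field names are invented for the example). With JSON the natural move is to deserialize and then write procedural code against the resulting structure; with XML I can instead hand a declarative path expression to the toolset and let it find what I asked for.

```python
import json
import xml.etree.ElementTree as ET

# The same illustrative message in both formats.
JSON_MSG = (
    '{"photos": ['
    '{"title": "Powder day", "tag": "snowboarding"},'
    '{"title": "Sunset", "tag": "beach"}]}'
)
XML_MSG = (
    "<photos>"
    '<photo tag="snowboarding"><title>Powder day</title></photo>'
    '<photo tag="beach"><title>Sunset</title></photo>'
    "</photos>"
)

# JSON: deserialize into native lists and dicts, then loop and filter by hand.
data = json.loads(JSON_MSG)
json_titles = [p["title"] for p in data["photos"] if p["tag"] == "snowboarding"]

# XML: state what is wanted as a path expression and let the library do the walk.
root = ET.fromstring(XML_MSG)
xml_titles = [t.text for t in root.findall("./photo[@tag='snowboarding']/title")]

print(json_titles, xml_titles)
```

Both lists come out the same here; the difference is in where the "how" lives. ElementTree only supports a small subset of XPath, so treat this as a taste of the declarative style rather than a substitute for XSLT.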
I love XSLT. I was telling a colleague recently about some of the things I’ve built with XSLT and he suggested that XSLT was a relatively rare skill set. Boy, I hope that isn’t so. When I first encountered XSLT I was sooo psyched - you see, I am a functional programmer at heart, having studied programming languages at IU under Dan Friedman - we did EVERYTHING in Scheme. I REALLY like the notion of declarative programming. I much prefer to express what I want done over precise details on how to do it. I am more than happy to have some framework (i.e. the Scheme interpreter) just “make it so.” As to the prevalence of XSLT use, I do think it is used a fair bit, AND, coming full circle back to the JSON/XML topic, there are a lot of programs where what I want to do on the client side (I don’t necessarily mean in the browser) is simply, or first, transform content, maybe into HTML for rendering, maybe into some normalized data format, maybe something else. Why do so many developers still insist on procedural programming when there is an alternative? A failing of XML/XSLT? Maybe. In any case, this is what scares me the most about JSON.
Admittedly I am an XML bigot and therefore probably fit the profile of someone who considers it “my precious XML” (as Dare puts it). I will, however, reiterate that I am also pragmatic enough to see the popularity and the value of JSON and when appropriate it will have its place in my designs. While David Megginson posted In praise of architecture astronauts (an excellent post in many respects) primarily to support his argument that JSON and XML are really not that different, they are both tree markup languages, he also shares my viewpoint stating
“In various situations, one syntax may have an advantage due to software support — for example, web browsers have built-in support for parsing XML or styling it using CSS, and they can convert JSON directly to JavaScript data structures using the eval() function…”
Exactly the point!
Upgraded Wordpress
July 16, 2007 on 6:20 pm | In Misc, web20 | No Comments
I finally got around to upgrading my Wordpress version - if I had known that it would really be as easy as they said it would be I wouldn’t have waited this long. The upgrade was really not more than the 5 steps described here. It almost took me longer to find my DB username and password than it did to do the upgrade. First class organization there at Wordpress!! And with the upgrade the RSS feeds are now working. I’ll get to Atom as soon as I can.
Web 2.0 and SOA World
July 12, 2007 on 6:44 pm | In ServiceOrientation, web20 | No Comments
A couple of weeks ago I attended the SOA World Conference and Expo in New York City. I gave a presentation on interoperability for content management; I’ll post on that topic again very soon, starting with my SOA World presentation but in short, it’s about standards and they need to be service oriented. More on that later.
What I wanted to talk about today was the dominance of the Web 2.0 theme at that show. Admittedly, I did attend a lot of talks in the Web 2.0/Ajax and SOA track so my perspective may be a bit skewed, but it was also present in many of the general sessions and keynotes. Is a Web 2.0 track at SOA World a cheap ploy on behalf of the conference organizers to get on a “hot” topic? Absolutely not! Let me take one step back in order to take several more forward.
SOA is fundamentally about loose coupling, heterogeneity and composition. It also involves governance, messaging infrastructures and sometimes registries. It shouldn’t be lost on too many folks who have been working in the software industry for more than a couple of years that many of the themes of SOA are not totally new; we’ve been talking about abstractions, reusability and development methodologies for as long as I have been in computing (and I’m sure much longer than that).
When I started my career 20 years ago I was programming embedded systems – image processing, where we were processing 30 frames per second. Everything I coded ran on a single processor. Sometimes there were a couple of processors that ran things in parallel but I remember timing functional components on the target processor and then manually arranging what would run on which processor and in what order. We were, of course, distributing the processing load across CPUs so that it was relatively even and aligning processing so that a “module” depending on output from some other process wasn’t left waiting on results for too long. We did have a bit of shared memory but managing access was a pretty tractable problem in that only two or three processors were accessing it and we pretty much knew when they would be relative to one another. If I remember correctly we implemented some simple semaphores and they worked just fine.
My code development was governed, I modeled and abstracted plenty of things – things were decoupled where they needed to be, we had a way of multiple processes communicating with one another. Heck we even had virtualization – we called our virtualizations “simulations”. My point is this. What changes all the time, and often very radically, is the environment in which our applications run and that environment continues to become increasingly complex. Where I once was sure that all of my processes mapped their code abstractions to memory the same way, I am now concerned with how my data models are represented across processes because those processes are implemented in different languages and are running on different platforms.
Okay, so SOA is an architectural approach that operates in an environment where we have a distributed network of computers, running a variety of operating systems, hosting a variety of applications, implemented in a variety of languages, oh and they all have to talk to one another. Whew – this is not your grandfather’s computing environment. But then here’s the thing. For the most part, SOA has been applied in the enterprise. The enterprise, I would argue, is still a constrained environment. Not as constrained as the embedded environment I coded to 20 years ago but it still has some constraints. Compared to what? Compared to the world wide web.
Ah, so now I finally get to the point of what it is about Web 2.0 that makes it an appropriate topic for SOA World 2007. The “enterprise” IT systems aren’t just in the corporation anymore (haven’t been for a while) – partners, vendors, customers all need to connect with the enterprise. I won’t make this already very long post any longer by arguing that utilizing the infrastructure of the web to make those connections is compelling – I’ll assume we all agree on that one. So taken in this context, where our operating environment is the web, the parts of Web 2.0 that are relevant to an SOA discussion are the architectural ones. (There are other Web 2.0 concerns like the emergence of the “prosumer” that I won’t cover in this post, though I’m sure to in the future.)
Ajax and RIA: Sure, we want to improve the user experience while running applications on the web but there is more. Just because the point of integration is not necessarily deep within the enterprise, or even happening on servers, doesn’t mean that governance goes away. Is there governance in the Web 2.0? Maybe. Yes, in places. Part of it comes through standards – Atom, for example. But there are also a lot of “accidental architectures” out there. We’ve got some work to do here.
Mashups: Yes, I specifically address this separate from Ajax because mashups to me have more to do with the notion of composition than they do with where the composition is happening. A colleague of mine recently created a mashup and reported (no surprise to me) that the process was painful in places – specifically the data models of the two “modules” he was composing were either poorly or entirely undocumented. Rob High had a slide where he listed some core tenets of SOA – right after the line that said “loosely coupled” was a line that said “strongly coherent”. In the Web 2.0 world we aren’t quite there yet.
Agile development and deployment: Frequent releases or the perpetual beta. How do we govern in an agile environment?
Heterogeneity: Yes, okay, we have been addressing this with SOA for some time but what is new in the Web 2.0 is that my runtime is the web. What type of web services runtime can I depend on in the wild, wild web? Am I all about REST? What about SOAP and WS-*? As an industry we are working on these answers.
These topics and then some kept the conference very interesting.
Atom 1.0 Overview
May 30, 2007 on 1:44 pm | In metadata, XML, Standards | 1 Comment
I’ve been looking at Atom a bit and, just as it seems to be with many standards, the first thing is to figure out where things are in the standardization process. I thought I had it figured out a couple of weeks ago, and started a blog post on the subject (which I never quite finished) but now I stumble upon more. Something pretty significant as it relates to Atom, surprising that I didn’t find it before (read, “how can I find related content if someone hasn’t explicitly placed the relationship in the content that I do know about?”)
Anyway, here is the overview that I’ve managed to build:
- While the text version of the document is the only normative one, I have found this HTML version to be much friendlier on the eye
“Originally, I was searching for an acronym. One that perhaps combined words like Proposal, Atom, Change, Enhancement. In the process I found a word that connoted forward motion[1], consensus[2], and peace[3].” - Sam Ruby
Web Services enabling the non-programmer
March 17, 2007 on 7:06 am | In ServiceOrientation, web20 | No Comments
We are headed in the right direction! After years, actually decades, of talking about “high-level” programming languages we have achieved the first major increment since the time where programmers were given a means to program in something other than assembly language.
I’m a computer scientist. I love programming languages. The more sophisticated (complicated) the better. Okay, so I am (and many of you out there are) a bit odd. People I meet outside of work, however, don’t usually peg me for a developer, which I like to think is a sign that I’m still at least a bit grounded in reality. I see the value in enabling a broader audience to build the tools that they need.
Twenty plus years ago when I was a freshman in computer science at Cal State University Northridge the high level language was something like Pascal. I did take a course in assembly language (“in what?” say you 20-somethings ;)) and Pascal was certainly much “higher” than that. Less than a decade later I was at Indiana University studying programming languages and we were looking at things like Smalltalk. While Smalltalk goes a bit higher level, arguably we are still targeting someone who is a programmer.
Did we get there with web services? I say not quite. For the first couple of years that I worked with web services, in the early 2000s, I used to refer to the “mythical” business analyst who would be composing web services into the higher level services, processes and applications they needed. My calling them “mythical” wasn’t a slight on the analyst, rather it was a slight on those of us that were providing them with the tools they needed. For example, somewhere around 2002, eRoom (later acquired by Documentum, and then by EMC) released its first web services interface. We talked about composite applications then and the tremendous advantage that these components offered. To leverage the interface you first used the WSDL to generate (say Java) proxy classes and now that you had a web services framework in play that offered run time support your life was “easy”. Yeah, right. Okay, the software developers were happy but the business analyst who could now build up higher value components was still mythical.
Enter Web 2.0 and things have finally changed. You want the evidence? Check this out.
I am working on a presentation for our upcoming EMC Software Developers Conference and on one of my slides I am talking about the soon to be released Documentum Foundation Services - a web services based interface to our content management system. Right now the slide only has text and I just want to put some type of graphic on it in an attempt to keep the neurons of my audience firing. Just a little icon that represents web services - what I want is simple, the equivalent of the COM lollipop diagram for web services. When I google “web services icons”, the first hit sounds promising, “Web Services Icons +”, and I have to tell you I’m thrilled. Not because I found what I am looking for but because I found something better! The icons I found are for things like Flickr, RSS, del.icio.us and digg and it dawns on me that this is the right view of web services for a far greater number of people than the number of traditional programmers out there. The icons for these web services don’t say a thing about how the services are implemented, no WSDL or REST, no SOAP, etc. but these things mean something to a great many people who are doing some very cool things with these services - truly creating composite applications and higher value!
So my search for the right icon for the programmer continues… but what is the right icon? Hmm….
Time as a service
February 23, 2007 on 12:22 pm | In ServiceOrientation | No Comments
The United States is changing its daylight saving time (DST) dates this year to begin three weeks earlier and end one week later than they have in previous years. I’m sure that everyone agrees (actually, I’m more sure that someone disagrees, preferring to control this entirely themselves) that it’s nice not to have to remember to change our time on numerous devices such as personal computers, mobile devices or even watches when DST changes occur. Each of these systems has a calendar with the DST dates (and other events) coded in, enabling the system to automatically update clocks for us. Now those windows have changed and this is causing all sorts of problems in applications that were constructed and deployed before this decision was made. In a few weeks my calendar will think that it is an hour earlier than I (and my many colleagues) think it is. Coordinating meetings for this three week period is already proving challenging.
This is an ideal example of something that should be packaged as a service. Sure, there still exist devices that are not connected to the cloud (my watch, for example), but for those that are connected, establishing the correct time, relative to location/timezone is something that should be provided as a service. This should be the last time that changes such as this should cause such a major headache.
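To see what "coded in" means in practice, here is a small sketch of the two rule sets as a device might hard-code them. The specific rules are ones I am supplying for illustration (old: first Sunday in April to last Sunday in October; new: second Sunday in March to first Sunday in November), and the function names are made up.

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday() counts Monday as 0 and Sunday as 6


def nth_sunday(year: int, month: int, n: int) -> date:
    """Return the n-th Sunday of the given month (n starts at 1)."""
    first = date(year, month, 1)
    offset = (SUNDAY - first.weekday()) % 7  # days until the first Sunday
    return first + timedelta(days=offset + 7 * (n - 1))


def last_sunday(year: int, month: int) -> date:
    """Return the last Sunday of the given month."""
    next_month = date(year + (month == 12), month % 12 + 1, 1)
    last_day = next_month - timedelta(days=1)
    return last_day - timedelta(days=(last_day.weekday() - SUNDAY) % 7)


def old_dst_window(year: int):
    """Rule baked into software shipped before the change."""
    return nth_sunday(year, 4, 1), last_sunday(year, 10)


def new_dst_window(year: int):
    """Rule in effect from 2007 onward."""
    return nth_sunday(year, 3, 2), nth_sunday(year, 11, 1)


old_start, old_end = old_dst_window(2007)
new_start, new_end = new_dst_window(2007)
print("starts", (old_start - new_start).days, "days earlier")
print("ends", (new_end - old_end).days, "days later")
```

Any device still running old_dst_window is wrong for those weeks until someone patches it. If the window came from a service, only the service would need to change.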