Cornelia's Weblog

my sporadically shared thoughts on, well, whatever is capturing my attention at the moment.

Posts Tagged ‘CMIS’

Building Domain Specific RESTful Services with XML Technologies

Some colleagues and I have been very busy over the last few months putting together a framework to do just that. Let me back up a bit…

One of the things that I’ve spent some time on in the last couple of years is the Content Management Interoperability Services (CMIS) standards development effort. One of the styles for exposing this set of services is a RESTful interface. Through incremental improvements made in the community development process, the result is a pretty nice, and quite RESTful, interface. That said, Roy Fielding levied some pretty harsh criticisms toward it when it was first made public, the vast majority of which have now been addressed. One issue from Roy’s list remains but is, frankly, something that CMIS didn’t intend to address: the services exposed over a CMIS-compliant repository do NOT expose the domain specific semantics in the resources. Roy is, of course, dead on about this. What CMIS exposes is the repository model, not your domain specific model. Yes, CMIS does provide a service through which a client can obtain object types, but those details are not a part of the protocol; rather, they are tunneled through the protocol. And, while CMIS resource representations do include hyperlinks between related resources, again the related resources are repository resources (e.g. this document is contained in this folder), not domain specific resources.

This is an all too common pattern: we build interfaces that expose the repository models. Of course, this is not at all without use, as it does help consumers build applications more rapidly. Unfortunately, it has the unintended consequence of encouraging architectures where the domain specific details all too often end up far too far from the server. My boss calls this the “semantic gorp”. In the best case it lands on an app server somewhere, embedded in services that in turn call the repository-model-centric services; in the worst case it ends up much further away. This causes all sorts of headaches, which I’ll go into in subsequent posts. For now I’ll just call on your intuition to buy that point.

(Okay, so that wasn’t entirely brief, but it is important context nonetheless.)

In the deployment of ECM systems you go through a process where you design your domain specific artifacts, content types, lifecycles, etc. You then (ideally) configure the ECM to provide services for those domain specific entities. Wouldn’t it be cool if you got domain specific services as a result of that process? Now, don’t misunderstand, we haven’t built that (yet), but what we just published to the EMC Developer Network is a step in that direction.

We have built a framework that allows for the development of domain specific RESTful services over a repository that supports XQuery. In particular, the implementation available for download operates over EMC Documentum xDB. Using the framework, all of the details of connecting to xDB are abstracted away from the developer, and he or she produces the RESTful services by

  1. “declaring” the resource model in JAX-RS annotated Java classes
  2. defining the uniform interface with more JAX-RS annotations (@GET, @PUT, @POST, @DELETE, etc.)
  3. writing an XQuery that maps each RESTful operation on each resource to the physical manifestation of that resource
  4. binding it all together in a Spring configuration file

In this first release of the framework we are mostly leveraging JAX-RS for REST features, which means that one of the most important features of RESTful interfaces, hyperlinking, is lacking. This will follow in version 2 of the framework in a few weeks.

One of the primary drivers for this work was to simplify the development process for these services. If they are easy to build, then we should see more domain specific services where otherwise generic services might have been. This is already proving true: we are leveraging the framework on several projects internally at EMC, and the developers were able to concentrate on their resource models, representations and interactions rather than getting bogged down in the details of how to implement them. Sure, they still need to touch Java code and Spring configuration files, but it’s not a huge leap to see that tooling could shelter the developer from those details. We’re just focusing on the developer model, not the tooling (for now?).

The framework, including source code, plus a sample application leveraging the framework, is available here. Also, for any of you who will be at EMC World in a few weeks, I am giving a talk on this framework in the highly coveted last slot of the conference in the developer track (Thursday afternoon at 1 PM). Hope to see you there.

Atom extension for hierarchy

There has been a bit of a debate going on in CMIS on this topic, and I’m interested in thoughts from members of the REST and/or Atom community, so I am posting a synopsis here. The CMIS mailing list is open to the public; you can see the most recent thread on the subject here, and from there you can find other postings if you are interested enough to dig that deep. I give a pretty thorough overview here if you don’t have the cycles to follow the links.

The subject at hand is how to represent a hierarchy of entities in Atom. Atom, of course, has a feed and an entry, but there is no mechanism for embedding a feed either in another feed or inside of an entry. I’m not really looking for feedback that questions whether hierarchical representations are a good idea. CMIS has made the decision that they need them, so assuming we are going to represent them, the question is: what is the best approach?

Over the last couple of months we’ve spoken with Nikunj Mehta, who is the author of an I-D on In-lining Extensions for Atom. It defines a mechanism that could be used for hierarchical arrangements by embedding representations as child elements of the Atom link element. Because the requirements driving the I-D and CMIS differ, and because it seems that the I-D will likely take a fair bit of time to reach consensus, the CMIS TC has decided to create its own extension, with the intent of replacing it with an applicable standard once one is available.

All of that context established, the CMIS TC has considered several options and has down-selected to two, which we are calling option 3 and option 4 (options 1 and 2 already having been dismissed).

What we refer to as option 3 is one where a folder entry has a cmis:children element that contains multiple atom:entry elements. In outline form it looks like:

    <atom:entry>
      <atom:title>Folder A</atom:title>
      ...
      <cmis:children>
        <atom:entry>
          <atom:title>Folder B</atom:title>
          ...
        </atom:entry>
        ... more atom entries ...
      </cmis:children>
    </atom:entry>

What we refer to as option 4 is one where a folder entry has a cmis:children element that wraps an atom:feed element that then contains multiple atom:entry elements.

    <atom:entry>
      <atom:title>Folder A</atom:title>
      ...
      <cmis:children>
        <atom:feed>
          ... a bunch of feed stuff ...
          <atom:entry>
            <atom:title>Folder B</atom:title>
            ...
          </atom:entry>
          ... more atom entries ...
        </atom:feed>
      </cmis:children>
    </atom:entry>

Option 3 has at least two advantages over option 4:

  • Option 3 will have a smaller representation because it does not include the feed and the elements that are required of a feed (i.e. the “bunch of feed stuff” shown in option 4 above).
  • Option 3 results in a simpler server-side implementation because the feed needn’t be created.
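To make the size argument concrete, here is a minimal sketch that builds both shapes for the same folder and compares their serialized lengths. Everything named in it is an assumption for illustration: the class and method names, the sample titles, the urn:example ids, the timestamp, and the placeholder URI bound to the cmis prefix. The id, title, and updated elements are the ones Atom (RFC 4287) requires of both entries and feeds; author and links are left out to keep the sketch short.

```java
class EmbeddingSize {
    static final String ATOM = "http://www.w3.org/2005/Atom";
    // Placeholder: the real URI for the cmis prefix depends on the CMIS draft in use.
    static final String CMIS = "urn:example:cmis";

    // id, title, and updated, with made-up sample values.
    static String required(String title) {
        return "<atom:id>urn:example:" + title.replace(' ', '-') + "</atom:id>"
             + "<atom:title>" + title + "</atom:title>"
             + "<atom:updated>2009-04-01T00:00:00Z</atom:updated>";
    }

    // One atom:entry per child title.
    static String childEntries(String... titles) {
        StringBuilder sb = new StringBuilder();
        for (String t : titles) {
            sb.append("<atom:entry>").append(required(t)).append("</atom:entry>");
        }
        return sb.toString();
    }

    // The folder entry, with whatever content goes inside cmis:children.
    static String folder(String children) {
        return "<atom:entry xmlns:atom=\"" + ATOM + "\" xmlns:cmis=\"" + CMIS + "\">"
             + required("Folder A")
             + "<cmis:children>" + children + "</cmis:children>"
             + "</atom:entry>";
    }

    // Option 3: entries sit directly inside cmis:children.
    static String option3(String... titles) {
        return folder(childEntries(titles));
    }

    // Option 4: entries are wrapped in an atom:feed that carries its own metadata.
    static String option4(String... titles) {
        return folder("<atom:feed>" + required("Children of Folder A")
                + childEntries(titles) + "</atom:feed>");
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            String[] titles = new String[n];
            for (int i = 0; i < n; i++) {
                titles[i] = "Child " + (i + 1);
            }
            int o3 = option3(titles).length();
            int o4 = option4(titles).length();
            System.out.println(n + " children: option 3 = " + o3
                    + " chars, option 4 = " + o4 + " chars, overhead = " + (o4 - o3));
        }
    }
}
```

In this sketch the extra cost of option 4 is the feed wrapper plus its own metadata, so it is paid once per embedded collection and does not grow with the number of children. A real feed would carry more than this (links, author), so treat the numbers as a lower bound on the overhead.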

In spite of these advantages, I believe that option 3 has one significant disadvantage relative to option 4: we fundamentally have two different representations for the same resource, so client developers must maintain two implementations of a function depending on how the resource representation was retrieved. Let me explain with an example.

Suppose a client developer has written some code to display certain things about a children collection: the name of the collection, the last updated value, and a list of the children, along with a button that allows the user to select this location as the target of a new item creation. This code executes against a feed retrieved via the URL of that children collection (the URL could have been bookmarked, for example). The pseudocode for this is something like:

    void processCollectionResource(Feed childrenResource) {
      String title = childrenResource.getTitle();
      Date lastModified = childrenResource.getUpdated();
      // get the URL of the link relation with rel="self"
      URL postURL = childrenResource.getLinkURL("self");
      for (Entry child : childrenResource.getEntries()) {
        // do something with each child
        //      (this will be the same in both cases)
      }
      // do something with all of this: render, etc.
    }

Now the client developer realizes they want to do the same thing, but against a set of children that are embedded within a hierarchical representation. Let’s first look at the pseudocode for option 4. There is code that has to parse the child collection out; it is something like the first line of:

    Feed childrenResource = folderResource.getChildren().getFeed();
    processCollectionResource(childrenResource);

The processCollectionResource method can be used as is.

For option 3, I need to pass in the folder resource and start navigating from there, because some of the information I need is at the folder and some of it is in the cmis:children element. So I’d make a call something like:

    processCollectionResourceEmbedded(folderResource);

And the new code to process this slightly different representation is:

    void processCollectionResourceEmbedded(Entry folderResource) {
      String title = folderResource.getTitle();
      Date lastModified = folderResource.getUpdated();
      // get the URL of the link relation with rel="down"
      // NOTE!!!! the code is less self-contained here:
      // "self" is more direct than "down"
      URL postURL = folderResource.getLinkURL("down");
      for (Entry child : folderResource.getChildren().getEntries()) {
        // do something with each child
        //      (this will be the same in both cases)
      }
      // do something with all of this: render, etc.
    }

At the root of this difference is that with option 4 we are treating the set of children as a full-fledged, standalone resource. This allows us, for example, to have different metadata for the children collection than for the folder itself (the CMIS domain model doesn’t make this distinction; however, option 4 would allow for it, which is goodness). The children resource representation is self-contained: I don’t have to go up to the containing folder to find out something about the collection. Option 3 doesn’t really treat the children resource as a complete resource; it depends on the folder resource to describe something about the children collection resource.

I’m sure it’s not lost on anyone that with option 3 we are creating a new, proprietary (to CMIS), feed-like container mechanism with the cmis:children element instead of using the standardized one that already exists. I don’t think that is a good idea.

Finally, and perhaps most subtle, is the fact that with option 3 we are really treating the document that is retrieved as the unit of importance, because we are requiring more than just the representation of the children collection in order to process it.
I am extremely interested in seeing CMIS provide support for more than just the document web, pushing into support for the linked data web. Atom is designed to support it, and I don’t want to lose that with a poorly designed extension to Atom.

Sure, embedding a feed duplicates some of the values from the folder entry in the feed (for CMIS), but I think that is a reasonable trade for the simplicity and elegance that it offers the client.
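The reuse argument for option 4 can also be sketched in runnable form. The example below is only an illustration under stated assumptions, not CMIS client code: it uses the JDK’s built-in DOM parser rather than an Atom library, the class and helper names are invented, the URI bound to the cmis prefix is a placeholder, and the sample documents are trimmed to titles only (a valid Atom feed would also carry id and updated). It shows one function, written against an atom:feed element, handling both the feed fetched on its own and the same feed embedded in a folder entry.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class EmbeddedFeedDemo {
    static final String ATOM = "http://www.w3.org/2005/Atom";
    // Placeholder: the real URI for the cmis prefix depends on the CMIS draft in use.
    static final String CMIS = "urn:example:cmis";

    // The children collection as it would look when fetched from its own URL.
    static final String FEED =
        "<atom:feed xmlns:atom='" + ATOM + "'>"
        + "<atom:title>Children of Folder A</atom:title>"
        + "<atom:entry><atom:title>Folder B</atom:title></atom:entry>"
        + "<atom:entry><atom:title>Doc C</atom:title></atom:entry>"
        + "</atom:feed>";

    // The same feed embedded in the folder entry, as in option 4.
    static final String FOLDER_OPTION_4 =
        "<atom:entry xmlns:atom='" + ATOM + "' xmlns:cmis='" + CMIS + "'>"
        + "<atom:title>Folder A</atom:title>"
        + "<cmis:children>" + FEED + "</cmis:children>"
        + "</atom:entry>";

    // Parses an XML string and returns its root element.
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Direct child elements only, so nested entries are never picked up by accident.
    static List<Element> children(Element parent, String ns, String local) {
        List<Element> out = new ArrayList<>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE
                    && ns.equals(n.getNamespaceURI())
                    && local.equals(n.getLocalName())) {
                out.add((Element) n);
            }
        }
        return out;
    }

    // Written once, against an atom:feed element. Returns "feed title: child, child".
    static String processCollectionResource(Element feed) {
        String title = children(feed, ATOM, "title").get(0).getTextContent();
        List<String> names = new ArrayList<>();
        for (Element entry : children(feed, ATOM, "entry")) {
            names.add(children(entry, ATOM, "title").get(0).getTextContent());
        }
        return title + ": " + String.join(", ", names);
    }

    // The only extra step option 4 needs: pull the feed out of cmis:children.
    static Element embeddedFeed(Element folderEntry) {
        Element wrapper = children(folderEntry, CMIS, "children").get(0);
        return children(wrapper, ATOM, "feed").get(0);
    }

    public static void main(String[] args) {
        System.out.println(processCollectionResource(parse(FEED)));
        System.out.println(processCollectionResource(embeddedFeed(parse(FOLDER_OPTION_4))));
    }
}
```

Both calls print the same line, because the embedded feed is the same representation as the standalone one. Under option 3 there is no feed element to hand over, so the title would have to be read from the folder entry and the entries from cmis:children, which is the second code path described above.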