Cornelia's Weblog

my sporadically shared thoughts on, well, whatever is capturing my attention at the moment.

Bonus track: Deploying the Echo Service via the service_broker

Part IV in a three part series ;-)

In Part II I went into a little bit of detail on how the echo service gateway and node work, in collaboration with the actual echo server. Or rather, in this case, how the node does not work with the echo server at all.  The echo_node implementation simply looks up some metadata values from its config file and returns them, via the gateway, to the requesting party (the dea).  You might question why the node was in there at all.  The answer is that in this case you don’t really need it.

Included in the vcap-services repository is an implementation for a service broker which can be used to act as a gateway to services that are running outside of cloud foundry.  Recall from part I in this series that the echo server, the java application that listens on a port and parrots back what it hears there, really doesn’t have anything to do with cloud foundry; again, not even the echo_node communicates with it.  So really, you can think of it as a server running external to cloud foundry.

The service_broker is a service_gateway implementation that stands up a RESTful web service that implements all of the resources that any service gateway does; resources that allow you to create or delete a service instance, bind or unbind it from applications, etc.  In this case, the gateway implementation simply offers a metadata registry with entries for each of the brokered services. In a “normal” cloud foundry service implementation, in response to these web service requests the gateway will dispatch a message to NATS, a node will pick it up, fulfill the request and communicate back to the gateway, which in turn responds to its client.  In the case of the service_broker there is no node and hence no need to dispatch messages onto NATS.  The web services of the gateway simply look up the values in its registry and return them.  This raises the question of how things get into that registry; the service_broker gateway offers a RESTful web service resource for the registry itself, supporting POST to add things and DELETE to remove them.  You can either issue those service invocations yourself or you can use the service broker cli.

So here’s what I did to go through the exercise of deploying the echo server as a brokered service:

Step 1: Start with a vcap devbox, following the instructions you find in the vcap repository, EXCEPT that you also need to start up the service broker; see this stackoverflow thread for how to do that.

Step 2: Register the echo service with the service_broker.  I used the service_broker_cli, which will already be on the devbox you set up in step 1.  Run with no arguments, the cli registers what it finds in the services.yml file in the config directory; I pointed it at my own config file instead, running it as follows:

bin/service_broker_cli -c config/echobrokeredservice.yml

with the contents of the echobrokeredservice.yml as follows:

---
service_broker: http://service-broker.vcap.me
token: "changebrokertoken"
service:
  name: echo
  description: cloud foundry sample service
  version: "1.0"
  options:
    - name: service
      acls:
        users: []
        wildcards: ["*@emc.com"]
      credentials:
        host: 192.168.1.150
        port: 5002

Notice the “credentials” section – these are the same values that were in the echo_node.yml file in part I, but now instead of the echo_gateway dispatching to NATS and the echo_node simply returning the values from the echo_node.yml file, the service_broker gateway just looks those values up in its registry and returns them.

When you now run a

vmc info --services

you should see the echo service listed.

Step 3: Deploy your client app and bind to the echo_service.

Seriously.  That’s it.

But before you think, “why did I go through all of those other hairy steps with the node and gateway and BOSH, etc.?,” remember that this echo server is just a sample, and a very, very simplified one. Servers that run within cloud foundry (and are hopefully deployed with BOSH) benefit from management, monitoring, logging and other cloud foundry capabilities, and the gateway/node combination is a loosely coupled mechanism for offering lifecycle operations on those services.  A very valuable part of the cloud foundry services story.  That said, there is clearly value in brokering external services and that’s why you have this bonus track. :-)

The Echo Client

So just a bit more on step 3 from above. The Echo client application that was posted all the way back in the original support forum article is a bit rigid.  It accesses the details of the bound services through the VCAP_SERVICES environment variable, and that’s fine, but it then looks for a service type named (exactly) “echo” and an instance name of (exactly) “myecho.”  This hard coding always bugged me, but because of the way that brokered services are named as a part of their configuration, I finally did something about it.  It required some changes to JsonParseUtil.java, where I now take in a string and look for a service type name and instance name that contain that string.  So in an updated version of the echo client source code that you can find here, the client will look for a bound service with the type and the instance name containing the string “echo”; so the service type of “echo_service” with a name like “echo_service-a302” fits the bill.  Oh, and one other slight add to the app – this one now prints the contents of the VCAP_SERVICES environment variable – helps when you are first getting started with such things.
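
To make the matching rule concrete, here is a minimal sketch in Java. It is not the actual JsonParseUtil.java from the updated client: the class name EchoServiceFinder and the method findCredentials are made up for illustration, and the sketch assumes VCAP_SERVICES has already been parsed into a map from service type to a list of instances, each carrying a name and a credentials entry.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration; not the real JsonParseUtil.
final class EchoServiceFinder {

    /**
     * Returns the credentials of the first bound service whose type key and
     * instance name both contain the given fragment (for example "echo").
     *
     * Assumption: services is VCAP_SERVICES parsed as
     * service type -> list of instances with "name" and "credentials".
     */
    static Optional<Map<String, Object>> findCredentials(
            Map<String, List<Map<String, Object>>> services, String fragment) {
        for (Map.Entry<String, List<Map<String, Object>>> entry : services.entrySet()) {
            if (!entry.getKey().contains(fragment)) {
                continue; // service type does not match
            }
            for (Map<String, Object> instance : entry.getValue()) {
                Object name = instance.get("name");
                Object credentials = instance.get("credentials");
                if (name instanceof String && ((String) name).contains(fragment)
                        && credentials instanceof Map) {
                    @SuppressWarnings("unchecked")
                    Map<String, Object> result = (Map<String, Object>) credentials;
                    return Optional.of(result);
                }
            }
        }
        return Optional.empty(); // nothing bound that matches
    }
}
```

With a substring match like this, a brokered service of type “echo_service” and an instance named “echo_service-a302” is found, where the exact comparison against “echo” and “myecho” would have missed it.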

Using the Cloud Controller REST interface

When starting out with Cloud Foundry you are likely to use vmc for most of your communications with the cloud, but under the covers, vmc just fires off HTTP requests to the cloud controller REST interface.  This REST interface is completely legitimate for you to use, either “by hand” with something like curl, or from within your applications using an HTTP client.  As is the case with much of Cloud Foundry at the moment, documentation on this interface is somewhat (okay, completely) lacking, so you have to be a bit creative in figuring it out.

I’ve just spent a few hours sorting through some of the idiosyncrasies of this Cloud Controller REST interface and here’s what I’ve learned.

  • There are a couple of people who did start to document this interface as a community effort; it hasn’t been updated in some time but it seems to still be accurate.  It’s here.
  • One of the best ways to figure out the REST resource is to run vmc with trace turned on.  There is a decent post on stackoverflow that describes this in a bit of detail.
    • One “gotcha” on this is that the trace doesn’t show you everything – in particular some of the details I list below.
  • There are a handful of resources that do not require authentication.  Things like:
    • Info and the services and runtimes sub-resources; you can GET these resources at /info, /info/services and /info/runtimes, respectively.
    • User token:  By POSTing to /users/{userid}/tokens you can get back the user token that is needed for many other resources.  For example, the following command will return my token.
      curl -X POST -d '{"password" : "mypassword" }' api.vcap.me/users/cornelia.davis@emc.com/tokens

      Note that this returns something of the form: “bearer …. and a whole bunch of characters ….” – the “bearer ” is part of the token.

  • Notice that the POST to get the user token didn’t require any particular headers – this will change as we progress through some of the other requests.
  • Now try doing a GET on /info, but with the header “Authorization” set to the user token.  You will notice that the json returned is slightly different, namely you will see that you are now authenticated.
    {
      "name": "vcap",
      "build": 2222,
      "support": "http:\/\/support.cloudfoundry.com",
      "version": "0.999",
      "description": "VMware's Cloud Application Platform",
      "allow_debug": false,
      "frameworks": [...snip...],
      "authorization_endpoint": "http:\/\/uaa.vcap.me",
      "user": "cornelia.davis@emc.com",
      "limits": {
        "memory": 2048,
        "app_uris": 4,
        "services": 16,
        "apps": 20
      },
      "usage": {
        "memory": 1792,
        "apps": 4,
        "services": 2
      }
    }

    Notice that no other headers need be set at this point – just the Authorization one.

  • With the Authorization header set you can now GET a bunch of different resources (this is just a sampling):
    • GET /services
    • GET /apps
    • GET /apps/{appname}
    • GET /apps/{appname}/stats
    • GET /apps/{appname}/crashes
  • As soon as you want to get more details on a particular service, however, you will need another header, or two.
    • Don’t ask me why – as I said some idiosyncrasies – but in the case of services there is this bit of code that I finally dug up in the cloud_controller/app/helpers/services_helper.rb file that validates that the content type header is set to JSON for services requests.  By the way, the error message that is returned if you don’t have this header is:
      {"code":100,"description":"Bad request"}

      So set the Content-type header to application/json and you will now be able to access the service offerings which you could not before.

      • GET or POST /services/v1/offerings: This gives you a list of the system services but with a bit more information than you get with /info/services.  In particular, you will get the URL for the service gateway.
    • And finally, there is a bit more of a drill down into the details of a particular service but in order to get there you need one more header: the service token.  This value is provided in the X-VCAP-Service-Token HTTP header and what the value is depends on how you’ve configured tokens for your service; in a BOSH deployment it is probably a part of your deployment manifest, in a vanilla vcap install, it is probably set in the cloud_controller.yml file.  So now you can access
      • GET /services/v1/offerings/{service-label}/handles: Note that this gives information about the service nodes, including things like the host IP address and the other things (credentials) returned by the service node.

    So my final curl in all its glory is:

    curl -H "Authorization: bearer ...the rest of my token..." -H "Content-type: application/json" \
         -H "X-VCAP-Service-Token: changeme" api.vcap.me/services/v1/offerings/cassandra-1.0/handles
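
The same request can be put together from within an application. Here is a minimal sketch using the java.net.http client that ships with Java 11 and later; the class and method names are hypothetical, and the base URL, tokens and service label are placeholders you would replace with your own values.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for illustration; mirrors the three headers used in the curl call.
final class ServiceHandlesRequest {

    static HttpRequest build(String baseUrl, String userToken,
                             String serviceToken, String serviceLabel) {
        URI uri = URI.create(baseUrl + "/services/v1/offerings/" + serviceLabel + "/handles");
        return HttpRequest.newBuilder(uri)
                // The user token already includes the "bearer " prefix.
                .header("Authorization", userToken)
                // Services requests are rejected with "Bad request" without this header.
                .header("Content-Type", "application/json")
                // The token configured for the service.
                .header("X-VCAP-Service-Token", serviceToken)
                .GET()
                .build();
    }
}
```

Sending it is then a matter of HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).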

Deploying a service to cloud foundry via BOSH

In this last of a three part series on learning how to add services to a Cloud Foundry cloud we’ll deploy the echo service into a BOSH-based deployment.  In part II you’ll find a more detailed description of the parts of a system service implementation, and also a description of and link to an updated version (updated from here) of the echo server itself.  If I’m doing my job right, with this post you should have an “ah ha” moment or two – as I already mentioned, I went through the exercise of learning about cloud foundry services in exactly the order mirrored with this series of blog posts, and a lot of things came together for me in this last step. So, let’s get started.

I’m going to roughly follow the instructions posted here – a BOSH release for the Echo Service. As I went through this exercise I was working off of an older version of this repository, with an older version of the documentation, where after cloning the repository you copy things from this directory into your cloud foundry release.  The latest instructions point out that BOSH now supports having multiple releases for a single deployment, a way to modularize a deployment, so you no longer have to copy things into a single directory structure for the cloud foundry deployment.  There is, however, something to be learned by copying things, so I’ve decided to keep this post in the older style to allow me to sprinkle the process with some explanation – I’ll refer to the steps as described in the older version of the docs.

Step 1: We already had a BOSH-based deployment of cloud foundry running in our lab.  We started with the cf-release posted here and modified it so that it consumed a few less resources (you would think as EMC that we would have all the vBlocks we need, but then you would be wrong ;-) ) ; before adding the echo service we were running 34 vms.

Step 2: Clone the repository (https://github.com/cloudfoundry/vcap-services-sample-release).

Step 3: Copy the job and package directories into your cloud foundry release.

cp -r vcap-services-sample-release/jobs/* cf-release/jobs/
cp -r vcap-services-sample-release/packages/* cf-release/packages/

If you haven’t already dug into the primary portions of a bosh release, here’s a brief explanation:

  • Packages describe all of the bits that will make their way onto the VMs that will run the service.  Every service I have looked at or built myself has had at least a spec file and a packaging file.
    • The spec describes what is required for that service component – dependencies on other cloud foundry packages (like ruby or sqlite) or files that are a part of the cloud foundry release. This tells bosh during deployment to copy these artifacts onto the VM that will run this component.
    • The packaging file is a script that runs after all of those bits have been delivered to the newly provisioned VM.  It usually will involve things like untarring a file and moving the resultant bits into the appropriate location on the VM.
    • Some packages will also have a prepackaging script that is run during the compiling of a package, before the VM is even provisioned.
  • Jobs represent the things that will be run on a VM and the files are generally start scripts and configuration files.  What is really interesting here is that those start scripts and config files are found in a subdirectory of the jobs directory called “templates.” The fact that these are templates allows you to instantiate them with values at run time, allowing you to do things like supply IP addresses of running machines at the point where that IP address is actually known.

There are two other major pieces of a BOSH release: 1) the blobs (which I’ll get to in a moment) and 2) the source tree containing code bits that make up the pieces of a package (mentioned in the package “spec” above).  I won’t say much about the latter in this post except that for the echo service, and all the base cloud foundry services, those bits get into your cloud foundry release via some git magic – it’s all in the ./update command that you do after cloning the cf-release repository. This draws the pieces for those services, the node and gateway implementations, from the vcap-services repository.

Step 4: In this step you are asked to put metadata for the echo server blob into the …/cf-release/config/blobs.yml file. This step isn’t needed at the moment, and in fact, the latest version of the docs for this sample release does not include it.

Step 5: Add echo to the list of built in services by modifying the cloud_controller.yml.erb file adding ‘echo’ to the line that starts with “services =”.

At this point the instructions tell you that you can do a bosh create release and a bosh upload release but there is one critical step missing – what about the actual EchoServer-0.1.0.jar? How do we get it running on one of the BOSH managed VMs?

I mentioned above that in addition to the packages and jobs portions of a cloud foundry release, there are also blobs.  For cloud foundry services these are generally the tar/zip files that contain the actual servers that will provide the service capabilities; the postgresql-9.0-x86_64.tar.gz file or the redis-2.2.15.tar.gz, for example. For our sample service this is the EchoServer-0.1.0.jar file. There are a number of ways that you can structure your cf-release leveraging git to make this perhaps a bit more elegant, but for now we’ll just do the brute force:

  1. Create the echoserver directory in the …/cf-release/blobs directory.
  2. Drop the EchoServer-0.1.0.jar file from part II in this series into that new echoserver directory.

(Looking into several of the echo service files that were copied over into the cf-release you can find reference to that jar file in places like …/cf-release/packages/echoserver/spec, …/cf-release/packages/echoserver/packaging and …/cf-release/jobs/echo_node/templates/echoserver_ctl.)

During the bosh create release this jar will then get included in the tar ball that is subsequently uploaded to and deployed into the cloud.

Okay, so now for the good stuff. In part II of my series I promised you that some of the ugliness around coordinating the command line arguments for running the Echo Server with values in the echo_node.yml file would get better with BOSH.  You see, BOSH is now responsible for running both the Echo Server (starting it with a java command) and the echo_node, so there must be a way that we can coordinate these two things.  There is.

The single place that we will put values that will then be used by the Echo Server and the echo_node is in the deployment manifest.  Under the properties: section you need to include the following:

  echo_gateway:
    token: changeme
    ip_route: ***.***.***.***
  echoserver:
    port: 5555

Then you have to see to it that the Echo Server and the echo_node pick up the port value appropriately.

Echo Server

In the …/cf-release/jobs/echo_node/templates/echoserver_ctl file you will find the java command that runs the echo server:

exec java \
    -jar EchoServer-0.1.0.jar \
    -port <%= properties.echoserver && properties.echoserver.port || 8080 %> \
    >>$LOG_DIR/echoserver.stdout.log \
    2>>$LOG_DIR/echoserver.stderr.log

Enclosed in the <%= %> is a template expression (using ruby’s erb feature) that pulls values from the deployment manifest.  But our Echo Server also takes in an IP address so we need to update this execution to the following:

exec java \
    -jar EchoServer-0.1.0.jar \
    -ipaddress <%= spec.networks.default.ip %> \
    -port <%= properties.echoserver && properties.echoserver.port || 8080 %> \
    >>$LOG_DIR/echoserver.stdout.log \
    2>>$LOG_DIR/echoserver.stderr.log

echo_node

In the …/cf-release/jobs/echo_node/templates/echo_node.yml.erb file you will find the port for the echo server specified; recall from Part II that the echo_node is oversimplified so that it just returns the port number that is specified in the echo_node.yml file.

port: <%= properties.echoserver && properties.echoserver.port || 8080 %>

Of course, now you can see that the port in this config file is drawn from the same source as the port supplied to the Echo Server when it is started.  Something you had to coordinate manually is now handled by BOSH.  Coolness.
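
If the ruby idiom in those two templates is unfamiliar: an expression of the form “a && a.b || default” reads the nested value when it is present and falls back to the default otherwise. Below is a rough Java equivalent, with the properties section of the deployment manifest modelled as nested maps. That modelling, and the class and method names, are assumptions for illustration only; BOSH evaluates the erb expression itself, in ruby.

```java
import java.util.Map;

// Hypothetical illustration of the "value or default" lookup used in both templates.
final class PortLookup {

    static final int DEFAULT_PORT = 8080;

    /**
     * Returns properties.echoserver.port when it is present as an integer,
     * otherwise the default of 8080.
     */
    static int echoServerPort(Map<String, Object> properties) {
        Object echoserver = properties.get("echoserver");
        if (echoserver instanceof Map) {
            Object port = ((Map<?, ?>) echoserver).get("port");
            if (port instanceof Integer) {
                return (Integer) port;
            }
        }
        return DEFAULT_PORT; // no echoserver section, or no usable port in it
    }
}
```

Because the start script and the node config both evaluate the same expression against the same manifest, they end up with the same port whether or not the echoserver section is present.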

Step 6: NOW you can do the bosh create and upload.  Because we were updating an already deployed release we need to do:

bosh create release --force

Followed by

bosh upload release

Step 7: And while we have already updated the deployment manifest with the properties for the node and gateway, you also have to update it to include the two jobs that will be part of our cloud foundry deployment. Note that each VM gets a single job, but that the echo_node job launches two processes, the echo_node implementation and the actual Echo Server. The following are roughly those parts taken from our deployment manifest; your mileage will vary depending on how you configured your cf-release deployment. Under the jobs: section:

 - name: echo_node
   template: echo_node
   instances: 1
   resource_pool: infrastructure1
   persistent_disk: 128
   networks:
   - name: default
     static_ips:
     - ***.***.***.***

 - name: echo_gateway
   template: echo_gateway
   instances: 1
   resource_pool: infrastructure1
   networks:
   - name: default
     static_ips:
     - ***.***.***.***

Oh, and we increased the size of our “infrastructure1” resource pool by 2.  Of course, you’ll have to update the ***.***.***.*** IP addresses appropriately.

Now do Step 8:

bosh deploy

You should now be able to push the same echo app as posted in Part II of the series.

Have fun!

The Anatomy of a Cloud Foundry System Service Implementation

In the first post of this three part series, I went through the steps we took to deploy the sample Echo service to a non-BOSH-based, single node cloud foundry instance, and in the last part we’ll do the deployment into a BOSH-based cloud foundry. In this post I want to spend just a bit of time explaining the parts of the Echo service itself.

A cloud foundry service implementation consists of three main components:

  • The service itself: This is the actual running service, for example, for Postgres this is a set of processes that implement the running database.
  • The service node: This is the code that provisions and deprovisions (and a few other things) cloud foundry service instances. In the case of Postgres, when a service instance is created, the code in the service node will make calls to the Postgres server itself, creating a new database, user and password. Attributes, such as the server address and the new database name, user and password will ultimately be passed back to an application that is binding to the service.
  • The service gateway: This is part of the mechanism for that “ultimately” – this code presents a RESTful service for creating and deleting service instances and gets those requests to the service node for fulfillment. When the node responds with values, the gateway returns them to the original requestor.

While my intent with this post is to mainly explain the Echo Service components, not to provide a general tutorial on creating cloud foundry services, there are a few general tidbits I do want to share right away.

  • For the service itself, you are likely not writing any code here, rather, you are probably taking some piece of software and deploying it (how depends on whether you are going into a BOSH-based deployment (part III) or not (part I)).  For example, we’re working on creating a Cassandra cloud foundry service, so the Cassandra implementation is just downloaded from here.
  • For the gateway, you have very, very little code to write.  The code you do write is in Ruby and leverages a bunch of cloud foundry code.  The gateway just extends these cloud foundry provided classes setting two things:
    • The service_name (this is actually set in a provisioner class that is then included in the gateway class) which is also set in the node implementation (via common base class included in both) and is used for communication between the node and gateway (messaging over NATS).
    • The name of the config file.
  • Most of your work will be in the node implementation.  This is written in Ruby and leverages a bunch of cloud foundry code. Your task here is to communicate with the server itself to do whatever needs to be done on provision and deprovision requests.
    • When you go into a multi-node cloud foundry deployment, your service node code will typically execute on the same VM as the actual service –if there are multiple service VMs, there will be multiple node processes running (one on each vm).  You’ll probably have one or two gateways – running on VMs separate from the node/service VMs. Think of the service gateway as a router of sorts to the set of nodes actually providing a service – one thing, the gateway is only called into action on provisioning and deprovisioning requests, not when an application is communicating with the service.
    • The gateway and node communicate via messaging (NATS) embedded in cloud foundry.

There is a lot more to say about this, but that’s a topic for another post.  In the meantime, I encourage you to study not only the Echo Service but also a more realistic implementation such as Postgres.

So, back to Echo.  The original instructions are a bit confusing as they first deal with deployment of the node and gateway, getting them running, even before there is a service for the node to speak to.  I’d probably address the actual service first before worrying about connecting to it with the gateway/node – just sayin. In fact, let me start there.

The Echo Cloud Foundry Service

The Echo Server

The service itself is found as an attachment in the original article – the echo_service.jar attachment – and it is super simple.  It is a java program – you can find the source in the echo_src.zip attachment of the original article – and it’s a single java class.  When you run this service it simply listens on a particular port and when something shows up on that port, it just takes that string and sends it back, over the same socket.  The code as you originally find it will listen at the ip address 127.0.0.1 on the port passed in when you run the jar.  The following line of code is the one that gets that localhost IP address in the method call on InetAddress:

serverSocket = new ServerSocket(port, 0, InetAddress.getLocalHost());

This didn’t work for me. My client app (which I talk about in the last section of this post) was sending out messages on the actual IP address of the box, not 127.0.0.1, so the echo server never saw them.  To solve this, a colleague of mine and I modified the echo server to take in, on the command line, the server ip address as well as the port.  Here you’ll find both the new jar file and the source.  If you use this new Echo Server implementation instead of the original, when you start up the server make sure you include the ipaddress and port as follows:

java -jar EchoServer-0.1.0.jar -ipaddress 192.168.1.111 -port 5002

We’ll come back to this in part III when the startup of the echo server is automated.
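
For illustration, here is a minimal sketch of an echo server that binds to an address and port given on the command line. It is not the source of EchoServer-0.1.0.jar; the class and method names are hypothetical, and the only things taken from above are the -ipaddress and -port argument names and the three-argument ServerSocket constructor.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch for illustration; not the actual echo server source.
final class EchoServerSketch {

    /** Parses "-ipaddress <ip> -port <port>" and binds a server socket to that address. */
    static ServerSocket bindFromArgs(String[] args) throws IOException {
        String ipAddress = "127.0.0.1"; // fallback values, chosen for this sketch only
        int port = 5002;
        for (int i = 0; i + 1 < args.length; i += 2) {
            if (args[i].equals("-ipaddress")) {
                ipAddress = args[i + 1];
            } else if (args[i].equals("-port")) {
                port = Integer.parseInt(args[i + 1]);
            }
        }
        // Bind to the requested address rather than InetAddress.getLocalHost().
        return new ServerSocket(port, 0, InetAddress.getByName(ipAddress));
    }

    /** Accepts one connection and sends every received line back over the same socket. */
    static void serveOne(ServerSocket serverSocket) throws IOException {
        try (Socket client = serverSocket.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line); // echo the line back to the sender
            }
        }
    }
}
```

A main method would call bindFromArgs(args) once and then call serveOne in a loop. The point of the change is only the bind address: a client connecting to the real IP address of the box now reaches the listener.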

The echo_node implementation

The echo_node implementation has the job of servicing provisioning and deprovisioning requests.  In a “real” service this would likely involve communicating with the Server to change its state (i.e. create a new database) and/or retrieve some values (username and password for the database), and then this information would be returned to the requesting party.  For Echo it’s much simpler.  There is nothing to create within the server and the echo server doesn’t expose any type of interface for asking about its state – it might, for example, have offered an API that returns the ip address and port for the socket it is listening on – but it doesn’t.  So the echo_node implementation does not communicate with the echo server at all and instead simply returns values to the requesting party.  Where does it get these values?  Mostly from the echo_node configuration file.  In part I, roughly following these instructions, the echo_node.yml includes the following two lines:

port: 5002
host: 192.168.1.111

The host property is no longer required – cloud foundry automatically includes the host and name properties in the values returned from the services node.   That said, you need to make sure the ip address you provide when running the echo server is, in fact, the ip address of the machine the echo_node is running on because that is the ip address that will eventually be passed to the client application.  And if this all seems a bit confusing, don’t worry, it will get far better when we deploy with BOSH – yep, part III.

The echo_gateway implementation

If I were to stay only with talking about the echo gateway here there wouldn’t be much to say.  Most service gateways are pretty much the same.  They present an HTTP (RESTful?) service for provisioning, deprovisioning, binding and a few other operations, they dispatch messages to NATS and wait for the service node to do its work and respond.  The response is then made available to the requestor.  As I said before, all that code is pretty much included in cloud foundry – you just have to do what I described earlier in this post. But let me give a very brief overview of what all of that lovely, provided-for-you code is doing.

  1. It presents an HTTP (RESTful?) service.
  2. It sets up a listener on NATS for responses it expects to get from the service node.
  3. It processes the request and dispatches the appropriate message to NATS (in step 2 we set up the mechanism for receiving a response).
  4. And when it gets that response, it in turn responds to the HTTP request that started the whole thing off.

There is a good bit of code in there that does things like handling the getting back of multiple values for a particular key and converts them into an array, and other goodies like that.

So, who is the recipient of the gateway response?  Well, ultimately it is the client application, and that is what I will talk about next.

The Echo Client

The echo client is also quite simple.  It’s a plain-old java web application – not spring.  Cloud foundry knows how to reconfigure certain types of applications by looking at the artifacts for that application type, like spring config files, finding common patterns, like the use of the javax.sql.DataSource interface for database connectivity, and setting values, like database server ip addresses, in there. It’s the cloud foundry stager that does this. But for a plain old java web app, the set of things that cloud foundry knows how to configure is limited to some basic things in the web.xml; beyond that, it has no idea how that app is configured. Does the app look for a property file? Or does it look things up in environment variables? Dunno.  So in this case the dea, which has the values that came back from the service gateway, just writes them into environment variables.  In fact, the dea always writes these values into environment variables regardless of whether the stager does anything extra with any of those values.

And that is how the echo client is consuming them.  If you have a look in the source for the echo client, you’ll see the following in the index.jsp:

String services_json = System.getenv("VCAP_SERVICES");

The services_json is then parsed to find the credentials object which in turn contains the host and port.

Summarizing, the echo_node obtained the host and port from its config file, which until part III you are responsible for keeping in sync with the arguments you provide when you run the echo server.  Through the echo_gateway, the dea took those values and wrote them to environment variables, and the client picked them up from there and configured itself to send messages over the socket at that ip address and port.

One point to emphasize: once the service instance is bound to the application, the node and gateway are out of the picture – at that point the app communicates directly with the echo server.

Got it? :-)

It takes a bit of study but in the end it all makes sense.

Have I mentioned that I love this stuff?

Learning how to add services to Cloud Foundry

There are a lot of things we want to do with Cloud Foundry; at the moment we are focusing on adding services such as Cassandra.  We’ve been studying the code for existing Cloud Foundry services (e.g. Postgres) and we’ve deployed the sample Echo service, a couple of times.  This was a pretty significant investment that is now set to pay off as we really feel like we know how to tackle the problem.  What I’m going to do in a series of posts is first (in this post) update the instructions you can find here on deploying the Echo service to a non-BOSH based deployment of Cloud Foundry, second more fully explain what the Echo service and client are doing and clarify what the various components are, and third, detail what we did to get things running in a BOSH-based Cloud Foundry deployment.  This mirrors the progression of our investigative work and I hope you might find it useful.

There are two versions of a guide on deploying the Echo service (and a client app utilizing it) – one on the support site and another in github. They differ slightly but both are about the same vintage (September 2011) and a bit out of date with the current Cloud Foundry code.  Also, I had to make some changes to the Echo service itself to get things running.  Here are the details – I’ll generally work from the version that is on the support site:

Create Single Node Cloud Foundry Instance

First things first, I did this deployment of Echo to a single-node Cloud Foundry instance that I installed using these instructions (as a services developer, micro-cloud foundry is NOT the local cloud solution).  I started with the Ubuntu 10.04 desktop image simply because I have been spending a fair bit of time poking around the contents of Cloud Foundry files and doing so with multiple windows is nice; you could also use the server version.  The Cloud Foundry install went pretty smoothly, though I did have to restart it once because one of the required resources wasn’t accessible (reported “Could not reach http://rubygems.org/”) the first time through; that is, after seeing an error, I simply executed the following command again and things continued to a successful install.

bash < <(curl -s -k -B https://raw.github.com/cloudfoundry/vcap/master/dev_setup/bin/vcap_dev_setup)

Do a quick check to make sure it’s all installed and working properly – start the server (step 3 in the instructions) and push an app – something like this one is super simple.

Make a snapshot of your image now.

Adding Echo

I’ll go into more details in the second part of this blog series, but there are three pieces to the Echo service: the echo gateway, which services provisioning requests (e.g. as a part of a push), the echo node, which does the actual work of provisioning the service, and the service itself.  All three are covered in the original doc, but it’s not entirely clear what is happening where – I’ll try to crisp that up a bit here.

Echo Node and Echo Gateway

In the original doc, this is what is happening in steps 1-8 – here are my updates to those steps:

Step 1: Do just what they say. One thing to note, however, is that when you are adding your own service, not echo, to a non-BOSH based cloud this will not be enough.  There is an innocuous little piece of the cloud foundry code base (I would call this a bug, but it’s an engineered bug) that has service names hard coded in it.  At https://github.com/cloudfoundry/vcap/blob/master/dev_setup/lib/vcap_components.rb#L399 (and a few more lines below it) you’ll see that there is a list of services – and you’ll note that echo is already in that list; that is, echo is already sort-of included in the base cloud foundry vcap code base.  If you are starting with something net-new, this is one place you will also have to make changes.  I’ll try to get to a separate post on this little gem, but in short, the names included in this list are used in the directory paths to the node and gateway bits of the service – that is how start and stop scripts are found, for example.

The good news is that this engineered bug is gone in BOSH-based deployments of cloud foundry.  Makes sense, BOSH is concerned with the components of the deployment, cloud foundry itself no longer is.  More on that in part III.

Steps 2 and 3: Do them just as described, though the lines for step 3 are already present in that file when cloud foundry is installed.

Step 4 is no longer needed.  In fact, the bin/services directory no longer exists in the current version of cloud foundry vcap.  These files were the executables for the service gateway and service node and were just proxies for the files that held the real startup details.  It looks like the startup mechanism changed from earlier versions.  Now, the aforementioned vcap_components.rb file has encoded into it THE place that the executables for the node and gateway are to be placed – convention over configuration!  That convention is as follows: Given the <name> of a service the node and gateway executables are placed as follows:

.../cloudfoundry/vcap/services/<name>/bin/<name>_node
.../cloudfoundry/vcap/services/<name>/bin/<name>_gateway
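The convention is easy to express in code. The following is only a sketch of the naming rule stated above, not the Ruby logic in vcap_components.rb; the class and method names are hypothetical, and vcapRoot stands in for the elided prefix in front of cloudfoundry/vcap.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class ServicePaths {
    // <vcapRoot>/services/<name>/bin/<name>_node
    static Path nodeExecutable(Path vcapRoot, String name) {
        return vcapRoot.resolve("services").resolve(name).resolve("bin").resolve(name + "_node");
    }

    // <vcapRoot>/services/<name>/bin/<name>_gateway
    static Path gatewayExecutable(Path vcapRoot, String name) {
        return vcapRoot.resolve("services").resolve(name).resolve("bin").resolve(name + "_gateway");
    }

    public static void main(String[] args) {
        // A relative root, purely for illustration.
        Path root = Paths.get("cloudfoundry", "vcap");
        System.out.println(nodeExecutable(root, "echo"));
        System.out.println(gatewayExecutable(root, "echo"));
    }
}
```

The point to take away is that the service name is the only input: pick a name that doesn't match your directory layout and the start and stop scripts will not be found.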

Step 5: This step involves two things 1) dropping the node and gateway implementations in the right location and 2) configuring them appropriately.

As I already mentioned, the echo service is mostly already included in the cloud foundry vcap code base so the first part is already done for you.  I took a close look at the implementation that is in the vcap repository and compared it to that which is in the echo_sp.zip file linked from the original article, and it looks to me like the code that is part of vcap is more up to date.

For configuration, you are first instructed to copy the echo_node.yml and echo_gateway.yml files from the …/cloudfoundry/vcap/services/echo/config directory to the …/cloudfoundry/.deployments/devbox/config directory.  This step isn’t absolutely essential, because on startup if the .yml file isn’t found in the latter location it will search the former.  Still, making the copy begins to give you an idea of how a BOSH deployment handles this – think of the copies under vcap… as a template that is instantiated into the .deployments… location; more on this in part III.
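The lookup order just described (deployment copy first, template copy as the fallback) can be sketched as follows. This illustrates the described behavior only; it is not how the vcap startup code is written, and the class name, method name and relative paths are assumptions made for the example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

class ConfigLookup {
    // Returns the first existing copy of fileName, searching the directories in order.
    static Optional<Path> resolve(String fileName, Path... searchDirs) {
        for (Path dir : searchDirs) {
            Path candidate = dir.resolve(fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative relative paths; the real prefixes are elided in the text above.
        Path deployed = Paths.get("cloudfoundry", ".deployments", "devbox", "config");
        Path template = Paths.get("cloudfoundry", "vcap", "services", "echo", "config");
        System.out.println(resolve("echo_node.yml", deployed, template));
    }
}
```

With this ordering, an edited copy in the deployment directory always wins over the one shipped with the source, which is why the edits below are made there.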

When you are editing the yml files in the .deployments… directory, the instructions are quite out of date – there are more required properties for those components than there were a bit more than a year ago; even the .yml files for echo that are part of the vcap source (which is what we are using) are not up to date.  The echo_gateway.yml should read:

---
cloud_controller_uri: api.vcap.me
service:
  name: echo
  version: "1.0"
  description: 'Echo service'
  plans: ['free']
  default_plan: free
  tags: ['echo', 'echo-1.0', 'echobased', 'demo']
  timeout: 15
  supported_versions: ['1.0']
  version_aliases:
    "current" : "1.0"
ip_route: localhost
index: 0
token: changeechotoken
logging:
  level: debug
mbus: nats://nats:nats@<nats_host>:<nats_port>/
pid: /var/vcap/sys/run/echo_service.pid
node_timeout: 2

And the echo_node.yml should read:

---
plan: free
capacity: 100
local_db: sqlite3:/var/vcap/services/echo/echo_node.db
mbus: nats://nats:nats@<nats_host>:<nats_port>/
base_dir: /var/vcap/services/echo/
index: 0
logging:
  level: debug
pid: /var/vcap/sys/run/echo_node.pid
node_id: echo_node_1
port: 5002
host: <host_ip_address>
supported_versions: ["1.0"]
default_version: "1.0"

Note the comment in the original instructions that indicates a preference for IP addresses.  Hint – have a look at one of the other service _node.yml or _gateway.yml files and grab the mbus value from there.
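Since a mistyped mbus value is an easy way to lose an afternoon, here is a small, hedged sketch for sanity-checking it before starting the components. It uses only java.net.URI from the JDK; the class and method names are hypothetical, and the address and port in main are made-up sample values, not ones taken from a real deployment.

```java
import java.net.URI;

class MbusCheck {
    // Returns "host:port" when the mbus value carries an explicit host and port,
    // and throws IllegalArgumentException otherwise.
    static String hostAndPort(String mbus) {
        URI uri = URI.create(mbus); // rejects illegal characters such as '<' and '>'
        if (uri.getHost() == null || uri.getPort() == -1) {
            throw new IllegalArgumentException("mbus needs an explicit host and port: " + mbus);
        }
        return uri.getHost() + ":" + uri.getPort();
    }

    public static void main(String[] args) {
        // Sample value only; substitute the mbus line from your own .yml file.
        System.out.println(hostAndPort("nats://nats:nats@192.168.1.10:4222/"));
    }
}
```

An unreplaced placeholder such as <nats_host> fails this check because angle brackets are not legal URI characters, and a stray colon after the @ fails it because no host can be parsed.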

Steps 6 & 7: Exactly as written in the original

At this point you have the echo_node and echo_gateway running, BUT the actual echo service is not yet running.

Echo Service

The echo service is a very simple java program that simply starts listening over sockets on the port you supply as a command line argument.  In the original instructions this was the very last step (I thought it a bit odd that the instructions had you push an app before the service was available – it’s okay, but a bit odd) – that is, step 3 in the section on “Consuming the echo service” (again, a misleading title as only steps 1 & 2 are part of the consumption; step 3 provides the service).

I wasn’t able to get the ssh tunneling (step 4 in the cloud foundry vcap installation instructions) working on my machine; instead I put entries for *.vcap.me in my dev VM hosts file, so I had to modify the Echo service to open the socket on my IP address of 192.168.1.xxx instead of on 127.0.0.1.  More details to follow in part II of this series.

But then I started the service with:

java -jar echo_service.jar -port 5002
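If you have never looked inside a server like this, the following sketch shows the general shape: listen on a socket, read a line, write it back. It is NOT the source of echo_service.jar; the class name, the positional arguments (bind address, then port) and the self-test in roundTrip are assumptions made for illustration. It also shows the kind of change mentioned above, binding to a specific address rather than 127.0.0.1.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class EchoSketch {
    // Serves one client at a time, echoing each received line back to the sender.
    static void serve(ServerSocket server) {
        while (!server.isClosed()) {
            try (Socket client = server.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                String line;
                while ((line = in.readLine()) != null) {
                    out.println(line);
                }
            } catch (IOException e) {
                // Either this connection broke or the server socket was closed;
                // in the latter case the loop condition ends the loop.
            }
        }
    }

    // Starts a server on an ephemeral loopback port, sends one line, returns the reply.
    static String roundTrip(String message) {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread worker = new Thread(() -> serve(server));
            worker.setDaemon(true);
            worker.start();
            try (Socket socket = new Socket(server.getInetAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            // Hypothetical usage: <bind address> <port>, to listen on one specific address.
            serve(new ServerSocket(Integer.parseInt(args[1]), 50, InetAddress.getByName(args[0])));
        } else {
            System.out.println(roundTrip("hello"));
        }
    }
}
```

Run without arguments it performs a single loopback round trip and exits; run with an address and a port it keeps listening, which is the behavior the echo_node.yml host and port entries have to match.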

Echo App

Okay, so now we are ready to deploy an application that consumes this service.  The original instructions have you do it in two steps, provisioning a service and then binding it to an application during the push.  You can do all of this during the app push.  The app.war posted with the original article works fine, provided you have everything configured properly; if not, you’ll probably get a stack trace (again, I’ll cover this in the second part of this series).  A couple of things to be careful of while you are doing the push.

  1. Choose java, not java7 for the java version – the app is compiled with 1.6.
  2. When you provision the echo service, make sure to use the name myecho for the name of the instantiated service.

And that should do it – browse to echotest.vcap.me and give it a shot.

Customizing Micro Cloud Foundry?

I love micro cloud foundry.  I spend a fair bit of time on airplanes, and I hate not being able to work because my internet connection is flaky in some hotel room so being self sufficient when I am doing dev is critical. So good thinking, and thanks Cloud Foundry team.

Micro cloud foundry is designed to be the self-standing companion to cloudfoundry.com and this recent blog post indicates that the team is working to keep micro-cloud in sync with cloudfoundry.com. Coolness.  But what happens when I want a self-standing “micro” version of cloud foundry that doesn’t match up with cloudfoundry.com, but rather a customized cloud – one that, say, has additional services available?  Off the top of my head I can think of two basic approaches – 1) Customize micro cloud foundry or 2) Create a single node cloud foundry to run locally and install your customizations in there.  I’ll explore #2 a bit more in another post; right now I want to poke on #1 a bit.

I have spent many afternoons over the last couple of weeks perusing my micro cloud VM including reading a lot of ruby code and from that have reached a few conclusions (please speak up if you think I’m misguided on any of these):

  • Micro cloud foundry is created by the folks at VMware as a BOSH release that puts all the components on a single node.
  • In addition to the cloud foundry components it also has a few bits specific to micro cloud foundry like the code for the console and offline mode.
  • The directory structure that contains all of the bits is complex – lots of links from parts of the tree to other parts of the tree.
  • On first startup entries are created in the embedded SQLite DB – these are essentially installing, or better put, enabling certain parts of cloud foundry – for example, the available services have entries placed into the database.
  • Hacking micro cloud foundry to add additional services (e.g. for Cassandra) is pretty much a no go – I tried.  This is why the cloud foundry team created BOSH – doing these types of things by hand is awful. And I don’t even know all of the steps that need to be taken even if I were crazy enough to try it anyway.
  • I should point out that existing documentation for how to create custom services basically hacks into an existing deployment – none of those instructions match up with micro-cloud and the docs for adding services to BOSH-based deployments start with “get the cf-release” – more on this in a second.

I’m convinced the right approach to creating a new version of micro cloud foundry is to get the micro cloud foundry bosh release, add my new service in there, do a bosh deploy and then take the resultant VM as my micro cloud.  In other words, pretty much what that doc for adding services to a BOSH-based deployment prescribes.  I’ve been digging around github, found repositories for micro, services-base and services but haven’t been able to find anything like mcf-release (a version of cf-release specifically for micro).

What I am trying to do here is an important part of the whole PaaS development paradigm – anyone else interested?

I’ve posted a short question here – keep an eye there for any responses.  I’ll keep posting on the subject as well.

Building Git on Ubuntu 11.10

I struggled a bit with this today but as is often the case, one little tweak (okay, really two) and I was good to go.  I did lots of googling in the process and found many sources for information on building Git from source but most were a bit older and out of date in one way or another.  And, of course, different flavors of Linux also account for some of the differences.  I did start with the “Installing from Source” from the Pro Git free book, and after a few modifications ended up doing exactly the following with success:

  1. Get git dependencies.  The apt-get command to do this given in the Pro Git book simply didn’t work for me – I got a “Unable to locate package …” for each of the packages on that line.  Instead I executed the following command:
    $ sudo apt-get build-dep git-core
  2. This, however, did not get the openssl files so I executed the following to get that package:
    $ sudo apt-get install libssl-dev
  3. Next I downloaded the source from https://github.com/git/git.  I didn’t find a tar file there so clicked on the ZIP button.
  4. Since I got a zip, not a tar I extracted the files using Ubuntu’s Archive Manager.
  5. Changed into the directory I extracted – i.e. /home/cornelia/git-git-5976753
  6. Then executed the following two commands:
    $ sudo make prefix=/usr/local all
    $ sudo make prefix=/usr/local install
  7. Now a
    git --version

    shows me:

    git version 1.7.12.GIT

WS-REST 2012

This post is a bit late – in fact two months late – but what I want to tell you about is interesting enough, and goodness knows the topic remains relevant enough, to warrant carrying on.  So here goes.

April 17 was the third annual WS-REST Workshop, a WWW Conference workshop.  I didn’t attend the first one in Raleigh in 2010, but I did make it to Hyderabad for WS-REST 2011 (why go in my neck of the woods when I can go clear around the globe instead? ;-) ).  In 2011 I presented a paper on our XML REST Framework, which allows a developer to implement RESTful services using an XML technology stack, from persistence to model and controller, with the framework providing views with some Java implementations.  This framework remains relevant in the work we do in the corporate CTO office; we use it regularly in proof of concept implementations and everything we do in the Architecture Group is around Web and RESTful architectures.  We are even still getting interest from outside of EMC.  In our 2011 paper, we emphasized that our framework puts hyperlinking front and center, something that is missing from most of the popular REST frameworks today – e.g. CXF, Jersey, Spring MVC.  Sure, a developer can still craft their own hyperlinks, but without making it a first-class concern in the framework, it is often not addressed.

I do a lot of evangelism within EMC, and increasingly in broader circles, about what REST is and its importance, and while I think I’m making SOME headway, it seems like the set of individuals who really get REST is still microscopic.  And what scares me the most is that with enterprise software moving to the cloud, REST has never, ever been more relevant in the circles I spend most of my time in. The paper I presented at the WS-REST workshop this year is an expanded version of my REST pitch.  I explain the fundamentals using the familiar World Wide Web as the example, and then I cover some of the mistakes often made in so-called RESTful services.

But enough about my papers, what I want to share are my perspectives on the workshop overall.

While the workshop was super, the group that participated was talented and delightful, and there were some really good papers, I’m afraid that I came away a bit worried that, despite our efforts, the REST “community” isn’t getting enough traction.  First, I thought the program in 2011 was stronger overall than it was this year.  Last year we had a really great keynote from Stu Charlton – this year, none (I’m still looking forward to meeting some of my REST heroes like Jim Webber, Ian Robinson or Stefan Tilkov at one of these things – come give us a keynote please :-) ).  There were slightly fewer papers (despite getting just as many submissions, I think).  Attendance was healthy but not overwhelming.  I was expecting quite the opposite – that with all the hype around the cloud there would be many more papers and a larger group of interested participants. It’s absolutely the case that there are many competing sessions, even on the workshop days, at the WWW Conference, but I still think there is more we need to do as a community in terms of broadening interest.

But now let me move on to something that is very positive:

There was a section of the program that was called “REST and the Semantic Web” – this is a really good thing.  Last year one of my favorite papers was presented by Kevin Page, REST and Linked Data: a match made for domain driven development, where he summarized, “hey, these things are so closely related – why aren’t the two groups talking.”   This section of the program, which held my favorite paper of the workshop, Functional Descriptions as the Bridge between Hypermedia APIs and the Semantic Web, presented by Ruben Verborgh, is perhaps evidence of some progress.  Another activity that makes me optimistic is the formation of the Linked Data Platform Working Group at the W3C, with a charter that includes:

The mission of the Linked Data Platform (LDP) Working Group is to produce a W3C Recommendation for HTTP-based (RESTful) application integration patterns using read/write Linked Data.

This group was formed as a result of unanimous agreement from the participants of a W3C workshop on Linked Enterprise Data Patterns, held in December last year.  The WG is seeded with a member submission led primarily by individuals from the Rational group within IBM.  The first meeting of the WG was just this last Monday, and active participants, particularly from industry, are very much being sought.

Passing parameters to an XQuery or XSLT through an XProc Pipeline

(syndicated from original posting on the EMC Community Network)

I’ve recently been helping someone get started with the XML REST Framework, and they asked about getting query string parameters into the XQueries or XSLTs that were a part of their resource implementation.  For those of you who are not familiar, the XML REST Framework leverages Spring MVC for all of the HTTP protocol stuff, and a developer builds a very thin Java shim to an XProc pipeline that implements the RESTful service.  The post I link to above and two earlier versions explain this all in more detail.  What I am focusing on in this post is how I can get values from the query string into the right places within my pipeline; there are two parts to this: 1) you need to get your hands on the query string argument and 2) you need to pass that into the XProc pipeline in such a way that it gets to the XQuery or XSLT.  It turns out that #2 was already demonstrated in the Patients.java class file in combination with the resourceGET.xpl.  Let’s drill in on that a little bit.

If you take a look at the top of the xpl you see the following code – really declarations of the inputs and outputs for the pipeline.

<p:declare-step name="main" xmlns:p="http://www.w3.org/ns/xproc"
     xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
   <p:input port='xqueryscript' />
   <p:input port="stylesheet"/>
   <p:input port="stylesheetParameters" kind="parameter"/>
   <p:input port="xqueryParameters" kind="parameter"/>
   <p:output port='result' sequence='true' primary='true' />
   <p:output port='error' sequence="true">
      <p:pipe step='checkXquery' port='error' />
   </p:output>
…

You’ll note two different input ports for parameters, one called stylesheetParameters and the other xqueryParameters; these were names of my choosing (just like parameters in any other language would be.) A parameter input port is named and the value is a list of keyword value pairs; that’s right, a parameter input port actually carries a set of key/value pairs (Norm Walsh, XML guru and one of the lead authors on the XProc spec says “[XProc] parameters suck.” – they are kinda weird and we are thinking about changing things in the next version of XProc).  In this example, I built the pipeline to keep the XQuery parameters separate from the XSLT ones; if I wanted to have a key/value pair passed down into both my XQuery and my XSLT, at the java level I could just add that pair to both parameters.  So speaking of that, our framework allows you to add parameters to the pipeline input with a call to the addParameter method:

pi.addParameter("stylesheetParameters", new QName("baseURL"), request.getRequestURL().toString());

Going back to the xpl then, let’s look at how the input parameters to the pipeline make their way down into the XQuery or XSLT. In each case, we just pass the parameter from the main pipeline down into the appropriate step.  XProc is nice that way and while the current parameter design might feel a bit odd, it does enable this easy, easy approach.

For XQuery:

<p:xquery name="xquery">
   <p:input port='source'>
      <p:document href="xhive:/" />
   </p:input>
   <p:input port="query">
      <p:pipe step="main" port="xqueryscript" />
   </p:input>
   <p:input port="parameters">
      <p:pipe step='main' port='xqueryParameters'/>
   </p:input>
</p:xquery>

And for XSLT:

<p:xslt>
   <p:input port='source'>
      <p:pipe step='xquery' port='result'/>
   </p:input>
   <p:input port='stylesheet'>
      <p:pipe step='main' port='stylesheet'/>
   </p:input>
   <p:input port='parameters'>
      <p:pipe step='main' port='stylesheetParameters'/>
   </p:input>
</p:xslt>

So then the only thing that remains is how to get your hands on the query string argument (#1 of the two things we needed to do) and because our framework uses Spring MVC for our REST protocol layer, this is in the Java code. The following method could be added to the Patients.java class of the sample application, to show this:

@RequestMapping(method = RequestMethod.GET, value="/alt")
@ResponseStatus(HttpStatus.OK)
public String getPatientAlt(HttpServletRequest request, HttpServletResponse response, Model model, @RequestParam("pid") String pid)
 throws XProcException, TransformerException, IOException {

  try {

    PipelineInputCache pi = new PipelineInputCache();
    // pass the patient ID into the pipeline for use in the XQuery (to look up the right patient)
    pi.addParameter("xqueryParameters", new QName("pid"), pid);
    // supply current resource URL as the base URL to craft hyperlinks
    pi.addParameter("stylesheetParameters", new QName("baseURL"), request.getRequestURL().toString());

    PipelineOutput output = m_getPatient.executeOn(pi);

    model.addAttribute("pipelineOutput", output);
    return "pipelineOutput";
  } finally {
    // no cleanup needed here
  }
}

The ONLY difference between this and the original getPatient method is where I get the value for the patient ID that I will pass down into the XQuery.  The way that Spring MVC does this is with the @RequestParam annotation.  You can see that for testing I changed the @RequestMapping path to be /alt so the URL that will invoke this code is something of the form:

http://example.com/XProcPatientServiceMVC/patients/alt?pid=12345

Note that I didn’t even have to create a new XProcXMLProcessingContext because I just reused all of the implementation from the original get patient – I’m just getting the value for the passed in parameter from somewhere else.

Cleaning WordPress Malware

I was hacked. I cleaned it. I was hacked. I cleaned it. I was hacked…

All told, I think it was 4 times before I stopped the attacks. I don’t like to mess with IT-type issues, least of all security issues, so when something like this comes up, I don’t know what I don’t know, and it takes me some time to learn enough to deal with the issue completely. Before I describe what I did to finally fix things once and for all (knock on wood), let me describe my naivety, because I suspect I am not the only one who suffers from it.

First, I never worried too much about security. With three people reading my blog on a semi-regular basis (hi again Dad), I figured “who would want to hack my site?” Bots don’t care about readership. Second, I figured my password was secure; I had a pretty clever password, wasn’t even in English (and see aforementioned comment on readership). Bots are multi-lingual. When I cleaned the site the first time, I figured I’d be safe for a while – how concerned would the hacker be to reinfect my little old site again so quickly? Bots work 24×7. And there are lots of them. No, I didn’t really believe any of these false assumptions, you think about the issue for, oh, maybe a microsecond and it’s clear exactly how naive these assumptions are. So beware – if you have a site on the World Wide Web, you are a target for hackers.

The First Sign of Trouble

I first became aware of the hack from my Dad (see aforementioned comment on readership) when he said that Google had reported to him that my site was compromised. I now believe that my site had been compromised for far longer than I knew – it was only when Google checked it out and started reporting its findings that I was alerted. I’m not sure exactly when Google launched this service but, based on my experience, I am betting it was within the last 6 months or so. After I cleaned my site the first time I used Google Webmaster Tools to request a re-evaluation of my site – you must have a Google account. I got a clean bill of health but alas, a couple of weeks later I was blacklisted again; sure enough, the infection was back. I’ll say more about ALL of the steps I took to finally rid the site of the infection “for good,” but first, kudos to Google – nice service those webmaster tools; easy to use, seemingly complete. Good stuff.

How I fixed it

Fortunately, in the end, the cleaning procedure is very straightforward (disclaimer: This statement applies to the type of attack that I specifically had. I have no doubt that there are other hacks that are MUCH more difficult to clean up.) You don’t have to download any special software (you will need an ftp client, but you should have that anyway), or generate any long log files that are then posted to some forum where a wizard will then give you five other free software packages to download and run (those things always scare me away). You need to be able to edit files (really simple edits), change some file permissions, and set your password for the site. Oh, and one other comment: security by obscurity is never a good idea (see aforementioned comments on bots) – sites that suggest you remove all mention of “WordPress” from your pages are being naive, even after thinking about it for far longer than a microsecond.
Okay, so on to the steps:

  1. Edit the files. The type of infection I had was really simple – some of my .html files had a line or more of the following form inserted into them:
    <script src="http://some.domain.com/somefilename.js"></script>
    I was lucky that in most cases the lines were at the top of the file, though in one instance the hack was a bit more clever and inserted the line more centrally. BTW, there are lots and lots of examples on the web about more clever insertions that are more obfuscated, but if you are editing the files by hand, even those should be relatively easy to spot. So, I edited each of the infected files (see below on the list of infected files I had) and deleted these lines.
  2. Set file permissions. In a WordPress site, the .html and .php files typically have permissions that allow the owner and group read/write access and the world to have read only access. As a result of the hack, the file permissions for the compromised files allowed the world write access. My web site hosting company doesn’t provide any interface that allows me to view or change file permissions so I used Filezilla – you’ll need to set the file permissions numerically – you want to set the value to 644.
  3. Make sure you get all infected files. I had files infected both in my WordPress and in my phpMyAdmin directories. The full list of files I cleaned (from the root of my directory structure) is:
    • index.html
    • /blog/index.php
    • /blog/wp-admin/index.php
    • /blog/wp-admin/network/index.php
    • /blog/wp-admin/user/index.php
    • /blog/wp-content/index.php
    • /blog/wp-content/plugins/index.php
    • /blog/wp-content/themes/index.php
    • /blog/wp-content/themes/classic/index.php
    • /blog/wp-content/themes/constructor/index.php
    • /blog/wp-content/themes/constructor/home.php
    • /blog/wp-content/themes/default/index.php
    • /blog/wp-content/themes/pool/index.php
    • /phpMyAdmin/index.php
    • /phpMyAdmin/main.php

    Of course, your mileage may vary, particularly in the themes area – check all of the php files in the themes you have installed for files that have permissions other than -rw-rw-r--.

  4. Change your password. I confess (in the hopes that my ineptness will benefit someone else), the first few times I cleaned my site I didn’t do this. I believe that changing my password after the cleanup is what finally thwarted further attacks. Change your password, and change it to something that can’t be found in a dictionary somewhere.
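If you have shell access to a copy of your site, the hunting part of steps 1 through 3 can be partly automated. The following is a minimal sketch, assuming a POSIX file system and the simple form of injection I described; the class name and the regular expression are my own illustration, and it only flags candidates for manual review (legitimate pages load remote scripts too, and an obfuscated insertion would slip past it).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfectionCheck {
    // Matches a script tag that loads JavaScript from an absolute http(s) URL.
    private static final Pattern REMOTE_SCRIPT = Pattern.compile(
            "<script[^>]*\\bsrc\\s*=\\s*[\"']?https?://[^\"'\\s>]+", Pattern.CASE_INSENSITIVE);

    static boolean hasRemoteScript(String line) {
        return REMOTE_SCRIPT.matcher(line).find();
    }

    static boolean isWorldWritable(Set<PosixFilePermission> perms) {
        return perms.contains(PosixFilePermission.OTHERS_WRITE);
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".php") || p.toString().endsWith(".html"))
                    .collect(Collectors.toList());
        }
        for (Path file : files) {
            // Requires a POSIX file system; throws UnsupportedOperationException elsewhere.
            if (isWorldWritable(Files.getPosixFilePermissions(file))) {
                System.out.println("world-writable: " + file);
            }
            // ISO-8859-1 decodes any byte sequence, so odd encodings don't abort the scan.
            for (String line : Files.readAllLines(file, StandardCharsets.ISO_8859_1)) {
                if (hasRemoteScript(line)) {
                    System.out.println("remote script in " + file + ": " + line.trim());
                }
            }
        }
    }
}
```

Point it at the root of a downloaded copy of the site and review whatever it prints; it reports, it does not fix, so the editing, the permission reset and the password change are still yours to do.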