Wednesday, September 29, 2010

Requirement / RPM Hell

For those of us who have been around Red Hat distros long enough, we remember RPM Hell: a cyclic series of RPMs that depended on each other, thus RPM Hell. Trying to find the one link that would let you install the rest of the packages was often an exercise in frustration that made you think your career path should have been in the area of Great White shark research and not software development. Our upcoming retrospective, and this XKCD, made me think of RPM Hell.

There is a bit of backstory required in order for the rest of this to make any kind of sense...

Our infrastructure and staff have gotten much better over the course of the last year. When I first started on this team, it was as a solo developer attempting to transform a questionable architecture into something resembling a sanely designed system. I had some backup and support from the team's lead, Tim Galeckas <@timgaleckas>, and some direction in the most general sense from the company's CTO, but not much more than that. It was one of those projects where the general gist is "Fix it" and "Fix it" is about all the direction you can expect. As one can imagine, the project was a huge time sink with minimal return.

A few months after that project was terminated, the new effort to rebuild the system in RoR got into full swing. We still had issues of course; nothing phoenixing from a process so severely broken can be without defects. As these issues became apparent, changes came to my little world.

The team is now so divergent from the original that it is hard to see how one originated from the other. We have a project manager, a user interface designer, four developers, two quality assurance members, and a true-to-life development manager. There is even a process in place to score, scope, and select stories <tickets, cards, issues> for development in any iteration. All that being said, there are still problems.

Our primary issue, in my opinion, is not the lack of staff, the backlog of stories, the intra-team interaction, or quality assurance backlogs. It is the quality of requirements that the development team receives. I have been converted from the waterfall-style development environment I first worked in to an agile approach, so complaining about story requirements might seem a bit odd. I do not enjoy a feature request that leaves nothing interesting for me to do. When presented with a story that tells me how to technically solve the issue, how to visually present the results, and what should be tested, I sigh. These stories are what turn us into code monkeys and not professionals. This is a fine line to tread; I want enough information so that when I deliver a piece of functionality it is complete, does what it is supposed to, looks good, and is performant. What I see as the RPM Hell of my development space is stories that look complete but have hidden requirements or functionality that is only available if you tap the brain of the primary stakeholder.

My current example is a story about displaying information from a third party system and how it weaves into the current system. The conversation and the story's development went something like this.

End consumers of InvestorBridge want to be able to view fund level information.

Great, no problem. What does that mean? 

Well, the system that we pull information from has a huge data set of fund level information.

Okay, we can already interface with that system, so what information do you want?


It is on the story.

That is something I love to hear. To me it means that I can load up the story, read it, and understand what they want. This is almost what happened. Almost. The story told of things like fund level returns and AUMs (Assets Under Management), and the display of these was to follow the current displays of account level information of the same type. Outstanding: easy to do, easy to validate, easy to import.

Where everything fell apart was that there happened to be a document attached to the story that contained additional requirements. That is right, the story had an attachment that modified the context of development. In our process, attachments are most often images of the expected display or test data. I have never before seen attachments used as additional requirements. Requirements should be in plain text on the story; this is the standard and what the developers expect. Where this devolved from a process into a cyclic definition followed by more questions was when the list of additional fields broke the current display model. This brought in our UI guy as well as the project manager. Scope creep was inevitable at this point. New views, click paths, and imports were all required after this document was rediscovered.

This brought two things to light:
  1. We, as a team, failed to understand the story and the feature requested
  2. We, as both a unit and as individuals, failed to ask the questions that would have made this apparent
This felt like RPMs all over again. If I had known which RPM was the keystone RPM, I would have been able to easily install the software and understand its dependencies. In much the same way, if I had known what questions to ask, I would have been able to see the full scope of the story and its underlying implications.

So, how do you know what questions to ask when you don't know what you don't know? Where do you start? Where is my yum for feature requests?

ActiveMessaging and Rails 3

For those of us upgrading our applications to Rails 3 there is something nasty to watch out for if you are also using ActiveMessaging. The Google Active Messaging group covers some of the issues that have been found.

Make sure to take a look at Spraints' fork.

http://github.com/spraints/activemessaging

Tuesday, September 21, 2010

SOAP, REST and XML Violence

A few of our clients have requested an API to send data and verify uploaded data. This, in general, is pretty standard. As you grow, you will run into clients that are more technically savvy than the others or have larger data sets with more frequent updates. If you are uploading three hundred documents a week to a given web site by hand, well, you begin to look for a better way to do things. Sometimes, and only sometimes, APIs are the way to go.


Out in the world, there is a huge argument over SOAP vs. REST and which is the superior API. I am by no means the authority on the matter, but it seems to me that there are different use cases for both of them. Now that my opinion is out in the open, I am going to clarify that position:


I HATE SOAP. Every company where I have been forced to use SOAP has driven me insane. Insane, I tell you! The amount of overhead required to work with SOAP makes me mad. The other part of this is that during the development cycle, the WSDL kept changing without a revision number. The providers I was working with deemed that since the service was still in a development stage, it was not a requirement to version the WSDL. Every time the system was almost complete and I had conformed to all of the SOAP contracts, the WSDL changed out from under me. This might be the cause of some bias on my part...


In the same breath I am going to defend SOAP for having a contract, something that REST lacks in the formal sense. A decent REST resource can be found here.


REST is great for me as a developer in a shop that needs to keep a decent velocity with a small head count. REST snaps right over the top of my already established controllers, and with a few modifications to the ActiveRecord models I can customize the output of the .xml request. Dean Radcliffe, one of the other developers, has convinced me that storing client configuration for XML output in the I18n files is not a bad idea.
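To sketch what that per-client configuration might look like, here is a stand-in using a plain YAML string and only the standard library. The client names, field lists, and helper are all hypothetical; in Rails the YAML would live in config/locales and be read through I18n rather than a constant:

```ruby
require 'yaml'

# Hypothetical per-client field lists (in Rails these would sit in
# config/locales and be looked up through I18n).
CONFIG = YAML.load(<<~YAML)
  client_a:
    fund:
      fields: [name, aum, inception_date]
  client_b:
    fund:
      fields: [name, aum]
YAML

Fund = Struct.new(:name, :aum, :inception_date, :internal_notes)

# Serialize only the fields configured for this client; anything not
# listed (like internal_notes) never reaches the wire.
def fund_to_xml(fund, client)
  fields = CONFIG.dig(client, "fund", "fields")
  body = fields.map { |f| "  <#{f}>#{fund[f.to_sym]}</#{f}>" }.join("\n")
  "<fund>\n#{body}\n</fund>"
end
```

The same lookup could just as easily feed the `:only` option of ActiveRecord's `to_xml` instead of hand-building the document.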

I found a blog whose opinions on APIs I tend to agree with.
Excerpt from REST and SOAP: When Should I Use

...Areas that REST works really well for are:
  • Limited bandwidth and resources; remember the return structure is really in any format (developer defined). Plus, any browser can be used because the REST approach uses the standard GET, PUT, POST, and DELETE verbs. Again, remember that REST can also use the XMLHttpRequest object that most modern browsers support today, which adds an extra bonus of AJAX.
  • Totally stateless operations; if an operation needs to be continued, then REST is not the best approach and SOAP may fit it better. However, if you need stateless CRUD (Create, Read, Update, and Delete) operations, then REST is it.
  • Caching situations; if the information can be cached because of the totally stateless operation of the REST approach, this is perfect.

....If you have the following then SOAP is a great solution:
  • Asynchronous processing and invocation; if your application needs a guaranteed level of reliability and security then SOAP 1.2 offers additional standards to ensure this type of operation. Things like WSRM – WS-Reliable Messaging.
  • Formal contracts; if both sides (provider and consumer) have to agree on the exchange format then SOAP 1.2 gives the rigid specifications for this type of interaction.
  • Stateful operations; if the application needs contextual information and conversational state management then SOAP 1.2 has the additional specification in the WS* structure to support those things (Security, Transactions, Coordination, etc). Comparatively, the REST approach would make the developers build this custom plumbing.
Mike Rozlog also makes a good point about XML. XML can be heavy, very heavy, if you are transmitting a ton of verbose data over the wire. Mo on Stack Overflow makes a pointed joke at the cost of verbose XML:
"XML is like violence - If it doesn't solve your problem, you're not using enough of it."

The current application that I am working on has both APIs, I am sad to report. One supports the legacy system that feeds it documents, and REST is now being provided to the clients as the API of choice for interactions on a programmatic level. I am happy to say that this is not as horrid as it sounds. The legacy SOAP code handles all kinds of stupid requests and statuses that are not needed by anyone or anything but an ill-conceived piece of stateless Java. As we transition our clients off of that legacy system, we will be able to DRY up the API controllers, and in this case, remove the legacy API controller entirely as it will have been replaced by a RESTful API.



I will stop ranting now and state: SOAP and REST both have their place in this world, but given the chance I would rather work with REST as a developer. Flickr and Twitter have great examples of REST working well.

--Rob

New Relic

I am attempting to add New Relic instrumentation to our ActiveMQ-powered pollers / processors. Any help or insight that anyone has would be valuable.

New Relic and Active MQ

Monday, September 20, 2010

Ruby on Rails and SOAP Hell

To whoever decided that WSDLs and SOAP should be the Enterprise communication standard,

I should murder you, slowly...

--Robert R. Meyer


Really? Really!? WSDLs? Come on, man. You're killing me here.

One of the WSDLs that I have to support, which we actually inherited from another product, is full of duplications and includes three calls to the same underlying function, with optional parameters as the only feature distinguishing the calls from each other.

The WSDL is 640 lines for a few methods. This strikes me as insane. The best part is that there is nothing DRY about WSDLs. They are by nature the most disgusting blend of XML and... well, something more disgusting... like badly written pseudo-OO PHP?

The primary issue I have with SOAP is that so many of our consumers use SOAP as their API of choice. I hate this; the definition language, while verbose, is inelegant. There is far too much boilerplate code required to even begin to use the API. Entire commercial solutions exist in order to alleviate this issue. In my experience and opinion, this usually denotes a problem.

Solutions like SoapUI and a few others are designed to generate the required boilerplate code to start using SOAP. Anyone looking at the generated code should realize that two thousand lines of Java should not be required to request a list from your data provider. REST is much better in this regard.

One of our primary products' WSDL definitions requires 271 lines of Java just to use the document upload call. A bit excessive? I would say so.
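For contrast, here is roughly what a document upload can look like over REST with nothing but Ruby's standard library. The endpoint URL, element names, and base64 wrapping are hypothetical, not our actual API:

```ruby
require 'net/http'
require 'uri'

# Build a POST with an XML body; [content].pack("m0") base64-encodes
# the document so binary data survives the trip.
def build_upload_request(uri, title, content)
  req = Net::HTTP::Post.new(uri)
  req["Content-Type"] = "application/xml"
  req.body = <<~XML
    <document>
      <title>#{title}</title>
      <content>#{[content].pack("m0")}</content>
    </document>
  XML
  req
end

# Sending it is one more line:
#   uri = URI("https://api.example.com/documents.xml")
#   Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
#     http.request(build_upload_request(uri, "Q3 Report", data))
#   end
```

A dozen readable lines instead of 271 generated ones, which is the whole point.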

After spending around four hours working on creating a new, simpler WSDL to expose document upload and meta-tagging, it is time to go home. All I can see is indented XML fragments with custom namespaces defined wherever the original creator determined it was best.

Headache? Check.
Code Blind? Check.
Mentally Drained? Check.

Time to go get a beer.

Selenium and Cucumber testing

I got talked into a Tech Talk for our internal conference, and I decided that if I have to do it, I am going to do it on something that is useful to my team. Thus, Cucumber and Selenium. I plan on covering the following points:
  • Why cucumber
  • Selenium and ruby / java
  • front end verification
  • good / bad practices with cucumber
  • Transactional Issues
  • ID Problems
More to come on this later

Rails Conf

Found out this morning that my buddy Jake Scruggs got one of his talks accepted to Ruby Conf. Check out his project: metric_fu

Wednesday, September 15, 2010

VPD and Oracle Scheduled Jobs

A few months ago we had a minor problem: the company's development Oracle server died. It did something fun and dropped a partition or two and generally went belly up. All things considered, not a major issue. The death of Oracle caused us some lost time, mostly because most of us were not running Oracle Express, which would have allowed us to keep on developing even with a down Oracle cluster. The quick lesson here: run Oracle Express if your box will handle the load.

One fallout of this problem, besides the lost time, was that we also lost our primary "Gold Schema." This was a big problem... The birth of our system was a bastard child conceived by one of our Senior Developers as part of a bet. The conversation went something like this...

S: I really think the attempt to upgrade the PHP application to a newer version is a waste of time.

A: I think it is better than the alternative of starting over on a new platform.

S: I bet I can get a working version of the system on Rails in a week!

A: So do it.
As any of you who have had this kind of conversation know, this was a bad idea on both sides. It led to the creation of the new system, which was supposed to be a proof of concept, in under a week. Granted, it was a 60-80 hour week, but all the same, code that is rushed like that takes on a ton of technical debt and tends to inherit the legacy system's debt as well. The debt I am currently talking about is that the Rails application's database was a copy of a MySQL database ported over to Oracle on short notice. Because of this tactic, we do not have migrations from blank to current state. That is not usually an issue with Oracle, considering our in-house systems allow us to request a clone of a current schema, but in this case, with the loss of the Oracle schema, we lost our baseline.

Recovering the baseline from one of the developers who was using Oracle Express was straightforward, but we forgot one thing when we made that developer's schema the master. We forgot about VPD. For those of you who don't know what VPD is, be thankful. It is Oracle's home-grown security system, affectionately called Virtual Private Database. VPD has good uses, ones that can be transparent to the developer: things like table level security and filtering, as well as constructing database sessions with additional audit information. A few examples can be found here. One thing it is not good at is row level security, which is what we were using it for.

Jobs are another part of the Oracle schema that we failed to remember at first. Our Database Developer informed me that there are at least three ways you can schedule jobs, and we managed to not clear out all of them when we removed VPD from our security model. One of the jobs, scheduled every fifteen minutes, constructed a materialized view of which user was allowed to see which document, based on a series of permission levels and tags. Generally you would not imagine this job causing any issues; it was just reading from around four tables and constructing a new view. All good. But in this case, things were not as nice as they should have been. You see, there was a bug.

The bug caused this job to lock all of the rows it was using to construct the view, in an attempt to verify that the view it would construct was accurate. Oracle locking somewhere in the range of half a million rows across four tables causes things like deadlocks. At least the deadlock provided a trace showing what had locks on the rows. As soon as we found this, we went through the new "gold" schema and blew away all remaining vestigial VPD operations.

Lessons to be learned:
1. Make sure your migrations allow you to build a new database from scratch
2. Verify that migrations remove tasks / jobs / views that are no longer needed and could impact the performance of your system
3. Do not use a bet as a good reason to create a new production system
4. Attempt to learn from the last generation of software's sins
5. Be friends with your DBA and DBD; they can save your ass.

--Just because you have a hammer does not make it the right tool for the job

Monday, September 13, 2010

Windy City Rails

A few of us here at Backstop attended Windy City Rails this weekend. It was a good time with a bunch of good speakers, including Jake Scruggs covering a ton of topics and providing a massive amount of info.

During the course of the day, WCR had a little project written in Rails 3. We, as a collective, coded the Dojo Chat Server. Interesting little toy, considering it was written in under 8 hours by a group of people with varying levels of experience and dedication to the project.

Just thought it was an interesting experience.

Rake Tasks Calling Rake Tasks

I am currently working on an export / import to move data from a legacy system backed by MySQL to the new generation system backed by Oracle. Because both our DBA (Database Administrator) and DBD (Database Developer) are overloaded, I have been tasked to create something that ports the data over. To do this, I created a Ruby project attached to the MySQL server and pumped out the data in a pipe (|) delimited format with headers.

This rocks for my uses because the main project uses the FasterCSV gem. Using a set of rake tasks, I can export the data from the old system, then execute the import tasks to spew the data over to the Oracle-backed system.
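The import side can stay tiny thanks to that format. A minimal sketch, with made-up column names since the real headers are not in this post (FasterCSV's interface became the stdlib CSV class in Ruby 1.9, so the call looks the same either way):

```ruby
require 'csv'  # FasterCSV under Ruby 1.8; same API as stdlib CSV in 1.9+

# Parse pipe-delimited data with a header row into plain hashes,
# ready to hand to the Oracle-backed models.
def parse_export(data)
  rows = []
  CSV.parse(data, col_sep: "|", headers: true) do |row|
    rows << { name: row["fund_name"], aum: row["aum"].to_i }
  end
  rows
end
```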

I was looking around for an easy way to call many rake tasks, in much the same way as Capistrano does. It turns out that it is drop dead easy.

Check out Calling rake tasks from another rake for details.
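The gist of it boils down to the sketch below. The task names are illustrative stand-ins, not our real export / import tasks; `include Rake::DSL` is only needed because the tasks here are defined outside a Rakefile:

```ruby
require 'rake'
include Rake::DSL # tasks are being defined outside a Rakefile

# Hypothetical stand-ins for the real export / import tasks.
task :export_legacy do
  puts "dumping MySQL data to pipe-delimited files"
end

task :import_new do
  puts "loading those files into the Oracle-backed system"
end

# Capistrano-style chaining: prerequisites run in order, once each.
task :migrate_data => [:export_legacy, :import_new]

# Or invoke tasks explicitly from inside another task:
task :migrate_explicit do
  Rake::Task[:export_legacy].invoke
  Rake::Task[:import_new].invoke
end
```

Either style works; prerequisites are the idiomatic choice, while explicit `invoke` gives you control over ordering and arguments at run time.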