Web Project Install Wrecks

June 11, 2010

I have been working on custom web applications for a few years now, and one continual issue is project installs. It’s interesting that when you’re developing non-packaged software, the actual install process is an afterthought. Projects have sprawling parts spread across the file system, install steps on an application server or web server, and some set of database tables/procedures, all of which many custom shops try to enumerate as manual implementation steps. This puts great pressure on the team producing the install documentation not to forget anything, and on someone to faithfully reproduce the steps, which means that even with a test run the process is still error prone. I think the thought is that because the software doesn’t ship, it’s not worthwhile to create automated install code that lets someone simply install the app and have the application take care of the additional steps. For simple applications it certainly may be overkill, but if you’re moving into multiple environments for a single release, which many larger projects have to do, it’s worth the investment.


Think Thread Safety

June 2, 2010

In multi-threaded languages such as Java, issues related to thread safety can be the most difficult to debug because they depend on timing during program execution. I would guess that most programmers know about thread safety but aren’t aware of when they need to be thinking about it. The answer is ALWAYS! Many times they just assume that the framework they are working with will take care of any thread safety concerns, but that’s not the case. Any time you modify code or create new code you need to ask yourself “do I need to be thread safe?” So why do we get caught in this pitfall today with newer frameworks, and how can you tell if you’re being thread safe?

Pitfall: The Singleton…

With the advent of Spring and lots of other frameworks, one of the dominant design patterns is the Singleton. By default, for efficiency reasons, many frameworks create only one instance of an object, and every thread calling the code executes against that instance. This means your objects need to be stateless when integrating with many frameworks. For many developers this is tough because we want to make objects with lots of state, or have existing objects with state that we want to integrate into the framework with little to no refactoring. On top of that, the actual call of your method will in many cases be wrapped in multiple points of indirection, and it will work when you are testing the simplest case, which tricks you into thinking your code is integrating properly when it is not. A thorough read of the framework’s documentation fixes the matter in almost all cases, but the reality is most folks learn from tutorials and experience first. Don’t get me wrong, thread safety has been and will remain a concern of great significance whether or not you’re using singletons, but the singleton is one of the major patterns used today, which makes it that much more important.

Thread Safe Or Not?

Now with the pitfall identified, how can you identify what needs to be changed and where the possible issues are? Some specific situations need more detail than I’ll go into here, for the sake of post length. I can at least provide a general rule for identifying whether a class is thread safe; once you have determined what the calling code demands, you’ll be able to tell if there is an issue…

If a class has no synchronization and accesses a variable that has state that is not appropriate to share with other executions of the class, then the class is NOT thread safe.

For this rule, a “variable that has state” includes any variable that can be changed by code external to your class’s methods or by the methods internal to the class. The rule sets aside how synchronization is used, mainly to simplify the discussion. Take a look at classes like this and make sure they are being used appropriately and not concurrently by different threads.

Not being thread safe isn’t a bad thing. But when creating or updating existing code, knowing whether you need to create something thread safe or not is something you always need to take into consideration. If you don’t know whether the code calling your code will be calling it in a thread safe fashion, STOP and research the callers to determine if you need to be thread safe.
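
To make the rule concrete, here is a minimal sketch. The class names UnsafeGreeter and StatelessGreeter are made up for illustration and don’t come from any framework; imagine each one being created once and shared by every request thread, the way a singleton would be.

```java
// Hypothetical classes for illustration only.

// NOT thread safe when shared: the field holds per-call state and there is
// no synchronization, so two threads calling greet() on the same instance
// can overwrite each other's value between the write and the read.
class UnsafeGreeter {
    private String currentName; // state visible to every caller of this instance

    String greet(String name) {
        currentName = name;             // thread A writes here...
        return "Hello, " + currentName; // ...but thread B may have written since
    }
}

// Safe to share: all per-call state lives in parameters and local variables,
// which belong to the calling thread alone.
class StatelessGreeter {
    String greet(String name) {
        String greeting = "Hello, " + name;
        return greeting;
    }
}
```

Both classes behave identically in a single-threaded test, which is exactly why the problem is easy to miss. The first one only misbehaves once two threads share the instance.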

Tech Leading 101: Vision

May 15, 2010

Being a technical lead on a project is a complex position with a lot of demands. Recently I have been thinking about what separates the good leads from the bad ones, and one of the main things that keeps coming up is ‘vision’: having an understanding ahead of time of the project’s technical and non-technical aspects, and having goals and ideas on how to address them. When you have a great lead it seems like they have anticipated everything and things fit together. There is continuity to the code. I believe that in many cases this is a direct result of the tech lead creating a vision for the project, communicating it to the team, and making sure everyone is working toward it. I want to discuss some areas of projects where tech leads commonly lack vision: because they get focused on the technicals, they forget some important non-technical aspects which are unavoidable.

Areas Lacking Vision

  • Schedule – I know, this should be the project manager’s job, right? Well yeah, but a tech lead needs to anticipate the best ordering of tasks, their dependencies, and which resources match best with specific tasks. Being on a project where you are constantly discovering dependencies that nobody thought were there makes development painful in many cases. Yeah, changing task order midstream is sometimes unavoidable, but put critical thinking into it at the beginning of the project and make sure tasks have a chance of starting off in the right order.
  • Task Delegation – Creating a pattern for how tasks are going to be handed off could be a tech lead’s best friend. Just saying “start on the task” isn’t sufficient. Envision ahead of time how you’re going to kick other developers off on tasks and what you don’t want to happen. This will allow you to avoid lost time on the calendar and getting pulled in too many directions at once.
  • Reviewing Code – This creeps up on far too many tech leads (myself included), assuming you practice having project code reviewed either internally or externally, and it leaves you unable to react to requested changes. Make sure to mentally anticipate tasks finishing and the who/when of code reviewing. The sooner you catch issues the better, and reviews are assurance that team members are completing tasks correctly.
  • Documentation – What I’m talking about here is knowing the right documentation that needs to come out of a project: understanding not only which client documents are needed but also which documents will help the people maintaining the project in the future. More often than not, when I come onto a project after the fact as a developer, I think “I wish I had documentation on this”.
  • Project Communication – Communication isn’t a problem for a lot of tech leads, especially when everyone is based out of one office. The situation where you need a vision of how communication will play out is when you have team members in different locations. Relying on email just isn’t an option in most cases. Too many times communication becomes a problem because someone on your team isn’t available to communicate in real time. Work on a vision that mitigates those situations, thinking about the obvious, like meeting times, but also the not so obvious, like which tasks get assigned to whom.

I’m a firm believer in working at the beginning of a project to create a vision of every aspect of the project possible, so you can work toward it and get the team to work toward it as well (so don’t keep it to yourself). Vision can change as things come up too, so don’t chain yourself to the original vision; allow project members to challenge it and challenge it yourself. Afterwards, if you feel like it needs to change, then change it.

Remember the Abstract Factory Pattern

May 6, 2010

With all the emphasis these days on unit testing, it amazes me that I still see developers dropping bombs like…

  public void someMethodThatMyUnitTestHasToCall() {
      // ... method logic ...
      SuperComplexObject object = new SuperComplexObject();
      object.doSomethingThatIDesperatelyWantToAvoidForMyTest();
      // or an alternatively horrific call
      DifferentClass.staticMethodCallWhichCallsSomethingElseIWantToAvoid();
  }

Nothing kills a unit test faster than near un-mockable behavior in the middle of a method call which you can’t avoid (at least in Java). The actions taken in these undesirable method calls and object creations are generally database activity or some type of HTTP access, which cause you to write more code to set up a unit test than there is in the code under test itself. Many cases can be refactored by simply making the object an instance variable; however, when an object has a lot of state which shouldn’t be retained from call to call, that’s not an option. I want to remind folks in these cases of the Abstract Factory Pattern, which is a very straightforward way to wrap these monsters up. Sometimes people avoid the exact pattern because in its pure form it involves creating an extra class and interface, which is undesirable if you have a lot of classes which need to be hidden, so I’ll offer a variation on the pattern which hopefully will make it more palatable.

Modified Abstract Factory

The modified abstract factory I’ll suggest is simply a single class that encapsulates the unwanted call. For object creation it will always be the actual “new Object()” statement, because if you can’t get another instance type created you’re out of luck. In the case of static method calls it’s the method call itself.

  public class ComplexObjectFactory {
      public ComplexObject getComplexObject() {
          ComplexObject object = new ComplexObject();
          return object;
      }
  }

Injecting the Factory

Now we need to get the factory into the class somehow. There are a couple of ways to do this…

  • Pass a factory instance in as a new parameter
  • Create a settable field in the class for the new factory

Depending on the situation you could have other options as well, but the one option that isn’t there is just replacing “new Object()” with “new ObjectFactory()” inside the method. That won’t get you any closer to solving the problem, because the test still has no way to substitute the factory. Now you can see how the old method changes.

  // Defaults to the real factory; a unit test can replace it via the setter.
  private ComplexObjectFactory factory = new ComplexObjectFactory();

  public void methodCalledByUnitTest() {
      ComplexObject object = factory.getComplexObject();
      object.complexMethodCall();
  }

  public void setComplexObjectFactory(ComplexObjectFactory factory) {
      this.factory = factory;
  }

Unit Test Mocking

Now there is a hook where you can create an extension of the quasi abstract factory class which returns a mock instance of your class or returns mocked data for a static method call.

  ComplexObjectFactory factory = new ComplexObjectFactory() {
      @Override
      public ComplexObject getComplexObject() {
          // Mock up the behavior in some way.
          return new MockComplexObject();
      }
  };

All you need to do is set the new factory implementation. This will allow your unit test to focus on the code under test, not the code that was tangentially called by the code under test (assuming that’s your testing methodology).
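
Putting the pieces together, here is a self-contained sketch of the whole round trip. ReportService, buildReport, and ReportServiceExample are hypothetical names I’m using for illustration, and ComplexObject is reduced to a stand-in whose real method throws so it’s obvious when the test fails to avoid it.

```java
// Stand-in for the expensive collaborator (imagine database or HTTP work).
class ComplexObject {
    String complexMethodCall() {
        throw new IllegalStateException("real work we want to avoid in a unit test");
    }
}

// The modified abstract factory: one class wrapping the "new" call.
class ComplexObjectFactory {
    ComplexObject getComplexObject() {
        return new ComplexObject();
    }
}

// Hypothetical class under test, using the settable-field injection option.
class ReportService {
    private ComplexObjectFactory factory = new ComplexObjectFactory();

    void setComplexObjectFactory(ComplexObjectFactory factory) {
        this.factory = factory;
    }

    String buildReport() {
        ComplexObject object = factory.getComplexObject();
        return "Report: " + object.complexMethodCall();
    }
}

// What the body of a unit test could look like.
class ReportServiceExample {
    static String runWithMock() {
        ReportService service = new ReportService();
        // Swap in a factory that hands back a canned object.
        service.setComplexObjectFactory(new ComplexObjectFactory() {
            @Override
            ComplexObject getComplexObject() {
                return new ComplexObject() {
                    @Override
                    String complexMethodCall() {
                        return "canned data";
                    }
                };
            }
        });
        return service.buildReport();
    }
}
```

The production code path never changes: without the setter call, ReportService still builds the real ComplexObject through the default factory.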

Active Preview 2.0 Release!

April 25, 2010

Just dropped a major update to Active Preview, a WordPress preview that allows you to see your changes as you make them! I have fixed issues with previewing newly created posts and, most importantly, added AJAX callbacks that fire once typing in the editor is complete to refresh the preview automatically, in case you are relying on WordPress to parse and render any tags or markup.


When to Repeat Yourself

April 18, 2010

One principle in software engineering is “Don’t Repeat Yourself”. It has been around for as long as I can remember and is in many cases the hallmark of good code. Many times I think developers get stuck in this “don’t repeat yourself” mode outside of the code, because we train ourselves so much to be constantly thinking about efficiency. Because of this, it creeps into places where it shouldn’t be, and gives developers like myself an excuse to say “we don’t need that document because the information will be in another document” or “just look at the code if you need to know something”. Both are certainly good justifications in some situations; however, they don’t always hold up. I want to throw out some ideas for when it’s OK to repeat yourself, to counterbalance the efficiency junkies in all of us.

Where is “Don’t Repeat Yourself” abused?

In code it’s rare for “Don’t repeat yourself” to fail, because if it didn’t work the compiler or testers would tell you. That’s the simplicity of code. But here are the places where I see “don’t repeat yourself” being used as an excuse to avoid additional work…

  • Developer to Developer Communication – I’ll single out us developers here, but it’s not necessarily a unique problem. A developer likes to communicate in the way that is most efficient for himself. One of the main ways we do that is by trying not to repeat ourselves when we communicate things. For instance, we’ll send out a group-wide email or schedule a meeting to train everyone at once, and assume everyone got the message. The problem is you need to repeat communication in many cases because people simply don’t understand (or listen) the first time.
  • Documentation – In some cases the misuse here is trying to combine documents which have similar content but need different levels of detail to cater to the correct audiences. The main offense I’m thinking of, though, is not repeating yourself in documentation because someone can just look at the code. This applies mainly to documentation meant for other developers, especially when someone put work into in-code documentation. The issue isn’t that the source code can’t explain exactly what is happening; it’s that if you plan on having additional people work on the project, sending them to the code increases ramp-up time, eliminating any time saved previously.

Repeating Yourself = OK

There are a few situations in which repeating yourself is probably not a bad idea…

  • Different audiences – Anytime you are preparing something for two different sets of people with different perspectives on an application/system, it’s probably a good guess that you’ll need to repeat yourself in either a different format or different level of detail.
  • Buried Information – Many times you’ll have information which is important but gets lost in a larger document or system. When this happens consider repeating it again in a place which is appropriate for the level of importance.
  • Inability to Properly Cross Reference – When you create anything which depends on another piece of information and you’re not able to reference that information in a way the user can easily follow, consider repeating the information. Partial information can be dangerous and leads to issues and confusion.
  • Informal Communication – This is any communication that is not made in a lasting way or requires additional context to understand. Examples of these are a design notes session, training session materials, and many emails actually as well. In this case any communication that wasn’t made in a lasting way needs to be repeated and potentially any supporting documentation may need to be rewritten.

Limited Efficiency

“Don’t Repeat Yourself” is a key concept for being efficient, but there is a limit to the efficiency you can gain from it. If you find yourself breaking the guidelines above just to avoid duplicating something, chances are you’re just trading efficiency today for lost efficiency down the road. You need to be prepared to repeat yourself for the sake of the efficiency of others.

Hidden Gems in Apache Sling

April 12, 2010

One of my favorite open source projects that I’m currently following is Apache Sling. It’s a RESTful interface to a Java Content Repository (JCR), specifically Apache Jackrabbit out of the box, and also an application framework. I’m less excited about the application framework part, but mainly because RESTful content is just that exciting (I’m weird, I know).

Sling is still in the Apache incubator and, in my opinion, lacking in depth of documentation. Also, don’t be fooled into thinking this is a Content Management System; it’s a system interface into your content, on top of which you could build a user interface to manage content. I recently have been doing some searching for additional documentation and peeking around in the source, and wanted to expose what I think are two real hidden gems that weren’t easy to find (for me): JSON Query Servlet and JQuery JCR Explorer.

JSON Query Servlet

When you first take a look at Sling you see the ability to request and add specific content nodes RESTfully, but no way to query for a set of content. This was a real letdown because I was interested in whether Sling could almost eliminate server-side code in a SOFEA style application. After doing some searching, though, I found a blog post ( http://in-the-sling.blogspot.com/2008/09/how-to-use-json-query-servlet.html ) detailing the JSON Query servlet, which I did not find documentation for on the Sling site.

The query servlet allows you to run XPath or SQL-like queries against the content repository and get the content back in JSON format. An example XPath query URL would be http://localhost:8080/content.query.json?queryType=xpath&statement=//parentcontentnode/node/*
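
If you are building these URLs from code, a small helper along these lines could do it. This is only a sketch: SlingQueryUrl and xpathQueryUrl are names I made up, the queryType and statement parameters are taken from the example URL above, and I’m assuming the servlet decodes the statement like any ordinary request parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; not part of Sling.
class SlingQueryUrl {
    // baseUrl is the content path to query, e.g. "http://localhost:8080/content".
    static String xpathQueryUrl(String baseUrl, String xpathStatement) {
        try {
            // Encode the statement so characters such as '/', '[' and spaces
            // survive as a single query parameter value.
            String encoded = URLEncoder.encode(xpathStatement, "UTF-8");
            return baseUrl + ".query.json?queryType=xpath&statement=" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling SlingQueryUrl.xpathQueryUrl("http://localhost:8080/content", "//parentcontentnode/node/*") produces the same query as the example URL, with the statement percent-encoded.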

No install is necessary; it’s included within version 5 in the incubator. The drawback is that without figuring out access controls it leaves your content wide open to anyone who knows you’re using Sling.

JQuery JCR Explorer Bundle

One of the other things that isn’t included with Sling out of the box is the ability to browse the content repository. You could argue whether or not such a piece of functionality should belong to Sling itself, but it would definitely be a nice-to-have. I looked elsewhere for a quality open source JCR repository explorer. Of the explorers I found, the best one was an add-on bundle in the Sling SCM repository at http://svn.apache.org/repos/asf/sling/trunk/contrib/explorers/jquery. It’s a pretty simple jQuery based repository browser which has some issues but overall works well.

To install you’ll need to first build the bundle with Maven; ‘mvn package’ should do it. Then start Sling, visit the OSGi administrative interface for the container you’re running Sling in, and install the bundle via the normal process. Once it is installed you’ll have a new ‘explorer’ selector available to visit each piece of content in your JCR. To see the root go to http://localhost:8080/.explorer.html

Active Preview 1.2 Released!

April 4, 2010

Normally I wouldn’t get excited about a point release, but we found BIG issues which prevented ALL themes from being supported, plus issues with Firefox file uploading and theme previews. All have been fixed. Hope folks upgrade ASAP! Check it out at http://wordpress.org/extend/plugins/active-preview/.

In case you missed the first release post, here is the gist of what the WordPress plugin does.

The plugin provides a new preview button which creates a new window which is updated in real-time with your edits to the WordPress editor (tinyMCE or the HTML editor). Check out the video demo! I know it’s a little rough.

Plugin site is here http://wordpress.org/extend/plugins/active-preview/

Should I use Open-Session-In-View?

April 4, 2010

The Open-Session-In-View pattern is a very popular strategy for keeping a Hibernate session open for the duration of a request in a server-side Java web application (and possibly in other technologies as well). Basically the pattern uses a servlet filter to open a session on each request as soon as it’s received, and then closes the session once processing completes. Some people love it, some hate it, and some just use it because they don’t know what else to do. As with anything, I think that there are times to use it and times when it’s not appropriate. I want to provide some guidelines which I have found helpful in understanding when it is appropriate.
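
To show the shape of the pattern without dragging in the servlet and Hibernate APIs, here is a stripped-down sketch. ViewSession, ViewSessionFactory, RequestHandler, and OpenSessionInViewSketch are all stand-ins I invented so the example compiles on its own; in a real application those roles are played by the Hibernate session, its session factory, and the rest of the servlet filter chain.

```java
// Stand-in for the ORM session.
interface ViewSession {
    void close();
}

// Stand-in for whatever hands out sessions.
interface ViewSessionFactory {
    ViewSession openSession();
}

// Stand-in for "the rest of the request": controllers, services, view rendering.
interface RequestHandler {
    void handle(ViewSession session);
}

class OpenSessionInViewSketch {
    private final ViewSessionFactory sessions;

    OpenSessionInViewSketch(ViewSessionFactory sessions) {
        this.sessions = sessions;
    }

    // Plays the part of the filter method that wraps every request.
    void doFilter(RequestHandler restOfRequest) {
        ViewSession session = sessions.openSession(); // opened as soon as the request arrives
        try {
            restOfRequest.handle(session);            // everything else happens in here
        } finally {
            session.close();                          // closed only after the whole request is done
        }
    }
}
```

The finally block is the important part: the session is released even when the request fails, but never before the view has finished rendering.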

I’ll assume you know the problem this pattern solves, so we won’t talk about that (check out this link for discussion on that). But we’ll need to at least discuss the major implication of open-session-in-view, which is a database connection held open for an entire request, and then cover the guidelines which fall out of that.

Database Connection Open for the Entire Request

Because the pattern opens a Hibernate session first thing on each request, a database connection could be open for basically the entire request. For the purpose of this discussion we’ll assume that the open session ALWAYS correlates to an open database connection for the entire request. In the default case with many versions of Hibernate, an open session doesn’t open a connection until it is needed, but let’s assume the worst case scenario. This will ensure that the pattern is a match for you in all situations.

Problems with Open Sessions

If you have an open session over your entire request, with an open database connection associated with it, the problem is that you are tying up a database connection, which is a limited resource in any setting, for work which doesn’t necessarily involve the connection. Unless you’re in a high traffic application this generally won’t affect you, but later I’ll show where it could even without large traffic volumes.

Why Most People Don’t Mind Open Sessions

Now most people probably don’t care about having an open session or a database connection for the entire request, because in today’s relational database driven web world, in a majority of cases 90+ percent of the overhead in your request time is database work anyway. So an extra 10% of time holding a session or database connection is worth the cost to drive down complexity in your application.

Exceptions to the Trade Off

The exceptions to this trade off hopefully become pretty clear: when you have application logic which is going to consume most of the processing time of a request, the benefit of driving down complexity loses to excessive session and connection time. In many cases you’re talking about a minority of requests which fall into this trap, so overall the trade off is still worth it; you just need a couple of workarounds. In a high request volume situation with limited resources for Hibernate sessions/database connections, your open-session-in-view could become a liability.

Guidelines for Open-Session-In-View Usage

  1. Use it only when it drives your code complexity down! This pattern isn’t for all applications. When you have complex database interaction, either involving multiple databases or complex transaction strategies, it may not make sense. And if you don’t have Hibernate lazy loading issues happening, it’s probably not useful either.
  2. Use when the session/database connection usage is a HIGH percentage of a request’s processing time already. Look at each request which involves Hibernate session/database access and check whether that is the vast majority of processing time in the request. If it isn’t, then either think about another pattern or create an exception case for those types of requests via the filter or some other means.
  3. Use when the only threat to request time is database access itself. If there is potential for something else to go wrong, like an HTTP call hanging because no timeout is set, then think about creating an exception for those types of requests or think about another pattern.
  4. Use when a HIGH percentage of ALL requests filtered require session/database connection access. If you have an application in which a low percentage of incoming requests need database interaction AND you cannot distinguish these requests via your filter or some other means, then don’t use this pattern.
  5. The pattern is preferably used on lower request volume applications. Just a preference in case you are trying to squeeze application resources.

Examples of Where to Avoid Usage

  1. File upload requests – In most cases this isn’t an issue, but the right combination of a semi-large file and a slow client network connection could tie up an open session/database connection for some time, potentially timing out the database connection.
  2. Requests which trigger service calls – Specifically I’m thinking of web service calls where, if no timeout is set on the socket and the service becomes unresponsive, you’ll potentially tie up an open session/database connection. Again, potentially timing out the database connection.
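
One way to carve out exceptions like these is to have the filter check the request path before it opens a session. The sketch below assumes a hypothetical application where uploads live under /upload and service-calling requests under /services/; the class name and both prefixes are made up, so substitute whatever distinguishes those requests in your application.

```java
// Hypothetical helper a session-opening filter could consult first.
class SessionInViewExclusions {
    // Made-up path prefixes for requests that should NOT hold a session open.
    private static final String[] EXCLUDED_PREFIXES = { "/upload", "/services/" };

    // Returns true when the request should get an open session for its
    // whole lifetime, false when it matches one of the exception cases.
    static boolean shouldOpenSession(String requestPath) {
        for (String prefix : EXCLUDED_PREFIXES) {
            if (requestPath.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

Requests that are excluded this way still need some other strategy for the database work they do, such as opening and closing a session explicitly around it.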

Active Preview Released!

April 3, 2010

Just finished up a new WordPress plugin called Active Preview. The plugin provides a new preview button which creates a new window which is updated in real-time with your edits to the WordPress editor (tinyMCE or the HTML editor). Check out the video demo! I know it’s a little rough.

Plugin site is here http://wordpress.org/extend/plugins/active-preview/