Netuality

Taming the big bad websites

Archive for the ‘Process’ Category

If programming is like gardening …

… then a software team is like an aquarium.

“Programming is Gardening, not Engineering” says Andy Hunt (of Pragmatic Programmer fame) in one of his well-known Artima conversations.

Inspired by such an interesting 'organic' comparison, here is my own metaphor: a software team behaves quite like an aquarium. I assume not all my blog readers are aquarium hobbyists, so let me explain:

  • Permanent monitoring and adjusting. Left alone and unsupervised, an aquarium apparently manages to 'survive' by itself. However, subtle changes in water chemistry slowly start to build up. Interestingly, the fish seem to cope well with these changes – until a certain balance is crossed, and then they get sick and eventually die. In my experience, the threshold is rather thin: one day everything seems fine and the next day it's a major disaster. The effort necessary to clean up the situation is significantly bigger than the effort spared by not taking care of the aquarium. The parallel here is quite obvious: you can't manage what you can't measure, and you can't control what you can't manage. Software metrics, code reviews, frequent releases, testing and feedback – these practices are vital if you want a 'healthy' project and a 'living' team. Otherwise, beware: the inflexion point might be just a few days away*.
  • However, changes must be made gradually. Supposing a major shift in water parameters has been detected, taking immediate and radical measures will generally worsen the situation (unless the catastrophe is already there). It is highly recommended to spread the change over a reasonable period of time, and never to try to influence two major water parameters at the same time (pH and GH, for instance). The explanation: all these parameters are interconnected in intricate ways, so by changing one you automatically influence the others. By changing two or more, the outcome is hard to predict and might open the path to disaster. There's a nice parallel here: a major change in methodology, with the sudden introduction of multiple new or modified development practices, will only make the team unstable – even if, globally speaking, the change is a highly beneficial one. 'Good things come to those who wait' … and measure … and change … and wait … and measure … and change …
  • A beautiful aquarium is a visible one. Transparent glass, lights and everything. Would you keep and feed your fish if they lived in a black box you were afraid to look into?

* Of course [and fortunately], developers do not get sick because of a reeking team/project – they simply leave.

Written by Adrian

November 4th, 2004 at 7:45 pm

Posted in Process

Prevent feature creep by charging double!

Well, this is the very condensed version of Martin Fowler's latest approach to requirements creep. His idea is: (1) start by charging double, thus allowing a comfortable buffer for the project; (2) accept all new requirements without charge, within the limits of the buffer; (3) explain to the customer, and get them to agree, that the fixed-scope approach was a mirage; and (4) live happily ever after.
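To make the buffer arithmetic concrete, here is a minimal sketch of the accounting involved (my own illustration, not Fowler's – all the names are hypothetical): the bid is double the estimate, so the buffer equals the original estimate, and new features eat into it for free until it runs dry.

    /** Toy model of the 'charge double' buffer accounting (hypothetical names). */
    public class FixedBidBuffer {
        private final double estimatedDays; // the honest internal estimate
        private double bufferDays;          // starts out equal to the estimate

        public FixedBidBuffer(double estimatedDays) {
            this.estimatedDays = estimatedDays;
            this.bufferDays = estimatedDays; // bid = 2 x estimate, so buffer = estimate
        }

        /** Step (1): the quoted price is double the estimate. */
        public double quotedDays() {
            return 2 * estimatedDays;
        }

        /** Step (2): a new feature is free as long as the buffer covers it. */
        public boolean acceptFreeOfCharge(double featureDays) {
            if (featureDays <= bufferDays) {
                bufferDays -= featureDays;
                return true;  // absorbed by the buffer
            }
            return false;     // buffer exhausted: time for the step (3) conversation
        }
    }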

At first, the approach seemed fine, and I especially liked the 'risk mitigation', which is basically: if at step 3 you enter into conflict with the customer and start rejecting or charging for features, then at least you had peace of mind during step 2. Without this little stratagem, you'd have been in conflict with the customer from the very beginning of the fixed-price project anyway.

What I have some trouble accepting is point 1, charging double the usual rate. In his essay, Martin calmly explains that since we have better and more productive people, we can actually do the job for less. So, in order to apply this 'recipe', you have to be at least twice as productive as your competition. This basically places you in the upper region of the Gaussian bell, probably somewhere among the top 10% of companies. Well, that's a nice audience! It's always pleasant to know that you have a 90% chance of not being able to apply this advice.

The 'double bill' strategy raises another interesting problem: how do you apply it when the customer is internal? It is well known that internal projects are plagued by massive feature creep (and this is probably one of the main reasons why these projects fail so often). By 'charging' your corporate sponsor (generally, upper management) you're either asking to double the project length or doubling your team size. In both cases, you're in big trouble.

I guess I'll stick to the old way of doing things: providing good visibility on the project status, estimating each new feature request and letting the corporate pitbulls do the negotiation dance. That is, until I reach that top-10% Nirvana where we are allowed to charge double.

Written by Adrian

November 2nd, 2004 at 6:06 pm

Posted in Process

Hallowed be thy tablename!

If you haven't had the opportunity to work on a really big project, naming is probably not high on your list of programming best practices. And you are certainly going to regret that when your project grows.

Of course everybody, including good old Scott, knows that CUST signifies CUSTOMER and DEPT signifies DEPARTMENT. And, statistically speaking, the chances of these abbreviations meaning something else are very small – as long as your domain model is also quite small. But when the number of classes in the domain runs into the hundreds or even the thousands, you'll suddenly find out that CUST may signify CUSTOMS (as in 'customs tax'), CUSTOMIZATION or even CUSTARD. I am currently working on the development team of an ERP for the agro-food industry and wouldn't be amazed to see such an attribute name. I've seen worse – some details of the implemented business model are a total blasphemy against human logic and common sense.

Anyway, the problem is even worse in these big projects because domain model classes are not written by hand – they are generated. While this is hardly a novelty for you (please, no laughing in the audience), it also means that analysts compose the data model, then classes/mappings/SQL schema/docs are generated, and finally programmers write the business logic and infrastructure integration using the generated artifacts. Names are usually propagated all along the generation chain. And when a programmer finds 'Cust' in the name of an attribute, how does she know it's a 'Customer' and not some 'Custard'? Especially when the documentation is scarce and the author analyst is on a well-deserved six-month sabbatical in Antarctica.

Hence the need for standardization. This is usually done via a dictionary containing the abbreviations and their meaning(s). The rule is very simple: every word in the data model must be composed of abbreviations from the dictionary. Some programmers might argue that there is no need for abbreviations and that full words are fine – yielding lovely code such as '.getSecondaryBillingAddressForService(currentBill.getBillableServicesList(i).get(currentService)).getStreet().getName()'. This is perfectly understandable; however, let's not forget that some databases (Oracle, SAP DB, etc.) have tight limits on table and column name length – around 30 characters – and will, for instance, refuse to create the object in the first place. Which is mildly bothersome if you use a relational database*.
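To illustrate how mechanical this check can be, here is a minimal sketch of a dictionary-based validator (my own illustration – the dictionary entries and class names are hypothetical): split each identifier into words and reject it if any word is not a registered abbreviation.

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    /** Minimal dictionary-based name checker (illustrative only). */
    public class NameChecker {
        // In a real project this would be loaded from the shared dictionary file.
        private static final Set DICTIONARY = new HashSet(Arrays.asList(
                new String[] { "CUST", "DEPT", "BILL", "ADDR", "SVC" }));

        /** Returns true if every underscore-separated word is a known abbreviation. */
        public static boolean isValid(String identifier) {
            String[] words = identifier.toUpperCase().split("_");
            for (int i = 0; i < words.length; i++) {
                if (!DICTIONARY.contains(words[i])) {
                    return false; // unknown word: reject (and forbid the save)
                }
            }
            return true;
        }

        public static void main(String[] args) {
            System.out.println(isValid("CUST_BILL_ADDR")); // true
            System.out.println(isValid("CUSTARD_ADDR"));   // false: CUSTARD is not in the dictionary
        }
    }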

And the golden rules of domain model naming are:

  • Be a pedantic bastard. Don't just throw the dictionary into the wild and tell people 'yeah, pleease follow this standard'. Run automatic checks on every piece of the data model feeding the code generator, at each save operation if possible. I have implemented this inside an Eclipse plugin used by the project analysts: when hitting save on an entity containing invalid names, a window immediately pops up and reports the errors. Don't just display the errors – completely forbid saving if the entity has naming issues. This will keep the naming absolutely pristine; however, the analysts might be tempted to form a lynch mob. Do not give up.
  • Avoid synonyms, plurals, etc. This is a software product, not a grammar contest.
  • Mail out some stats from time to time showing how well the model is named. People will like that.

My current gig involves, among other interesting stuff, managing the naming tools in the various Java projects we are developing. Unfortunately, the naming rules were not really enforced (they had no pedantic bastards before me?), so the domain model is only partially compliant. Hence, I'm in the midst of developing tools for the automatic renaming of the model, and the renamed code is going to disrupt activity for a while (thank God for the autocompletion features in modern IDEs!). Things would have been much smoother if naming had been enforced from the beginning. Still, I think there is no such thing as 'too late' to put naming in order in a big project. And it will absolutely get done, because there is very strong managerial support for this kind of task (the main company shareholder and CEO is a former programmer himself, as well as a quality buff – 'when time permits'™).

Unfortunately, I had to allow some 'non-compliant' islands of code in the modules already deployed at customers. But have no false hopes: sooner or later I'm gonna get that code too. I'm a pedantic bastard, and proud of it.

* Now, if you're using a wannabe storage solution like Prevayler to store gigabytes of business data (or more!), you have much bigger problems than naming. Please stop reading this article and do something about it.

Written by Adrian

June 26th, 2004 at 11:54 pm

Posted in Process

Effective testing of database schema – the missing link

There is a certain contradiction in modern projects concerning the unit testing strategy. On one hand, there is a powerful assertion stating that business logic testing should be completely disconnected from the database. This makes perfect sense in a certain way: the tests should check the business logic, not the database and/or the persistence layer. Moreover, the persistence layer is generally a fully-fledged product (such as the excellent Hibernate) or some other JDO-esque solution with its own test suite – no need to check that it really works. Usually, the link between business objects and persistence is 'faked' using mock objects. Basically, this means that testing the code doesn't need a running database (well, code testing doesn't need a database at all).

On the other hand, the database schema should also be tested – and the only tool I am aware of is the excellent DbUnit. Although targeted more towards data testing, it copes quite well with schema testing. Nicely integrated with Ant, DbUnit is the right solution for your database testing needs. And yes, you do need to test your database, since it is supposed to evolve along with the code (there's a great article about evolutionary database design on Martin Fowler's site).
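For the record, here is roughly what such a check looks like when driven from Java instead of Ant – a minimal sketch, assuming a JDBC connection to the test database; the dataset file name and connection details are made up:

    import java.io.FileInputStream;
    import java.sql.Connection;
    import java.sql.DriverManager;

    import org.dbunit.Assertion;
    import org.dbunit.database.DatabaseConnection;
    import org.dbunit.database.IDatabaseConnection;
    import org.dbunit.dataset.IDataSet;
    import org.dbunit.dataset.xml.FlatXmlDataSet;

    /** Compares a hand-maintained expected dataset against the live test database. */
    public class CustomerTableCheck {
        public static void main(String[] args) throws Exception {
            Class.forName("oracle.jdbc.driver.OracleDriver");
            Connection jdbc = DriverManager.getConnection(
                    "jdbc:oracle:thin:@localhost:1521:TEST", "scott", "tiger");
            IDatabaseConnection connection = new DatabaseConnection(jdbc);
            try {
                // Expected state, versioned next to the schema scripts.
                IDataSet expected = new FlatXmlDataSet(
                        new FileInputStream("expected-dataset.xml"));
                // Actual content as DbUnit sees it in the database.
                IDataSet actual = connection.createDataSet();
                Assertion.assertEquals(expected.getTable("CUSTOMER"),
                        actual.getTable("CUSTOMER"));
            } finally {
                connection.close();
            }
        }
    }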

Somehow, we instinctively feel that something is missing from this picture. We are testing the code, disconnected from the database – and also the database, in an independent manner. But how can we be sure that the persistence layer between the application model and the database is OK? And I'm not talking about the persistence mechanics, but about the data model itself. Basically, this boils down to mapping testing. I am aware that some special O/R bindings do not need mappings, having a direct object-table correspondence, but I feel this is generally a BadIdeaTM since it hampers the flexibility of both the application model structure and the database schema.

In the small-to-medium-sized projects I've been working on lately, we didn't feel the need for mapping testing. The reason is very simple: the person performing the change on the database schema is usually the same person who needs a certain modification in the application model. After performing the modification, quite often this same person starts the application and runs a functional test which implicitly checks the mapping. Most of the time this works just fine.

However, some nasty problems might appear when the project starts to grow:

  • changing the mapping becomes more difficult, and some kind of testing might give indications about the nature of a problem.
  • there is a certain "schema decay": some foreign keys cannot be created at a certain point, and then their creation is forgotten once the data finally becomes consistent. As the schema evolves further, more and more relations of the object data model end up not being backed by integrity constraints.
  • you may sometimes end up with unmapped and unused tables/views/columns.

A really useful testing tool should be able to check one or multiple mapping files against a database schema (via DbUnit, why not). The tool should:

  • a) recognize different mapping formats (Hibernate, Castor, etc.) and different database types
  • b) match the mapping declarations with the tables from the database, checking their existence as well as the types of primitive columns
  • c) warn if some constraints are wrong or missing (based on simple aggregation, cardinality or other hints from the mapping structure).
  • d) warn about unmapped tables/views/columns.

Here's the good news: a tool able to perform a) and b) does exist! And the bad news (purists will jump with disgust): just for a moment, you should forget about testing your code without the database. The solution is quite simple: build a unit test which fires up the persistence layer and retrieves at least one object of each mapped type from a test database. If no exceptions are encountered, the test passes. This is a basic but effective approach (a minimal sketch follows the list below), and:

  • be prepared to have a testing database different from the development database, but with the schema automatically synchronized.
  • harden your test case by inserting the most "exotic" test data you can find. If the data goes in via SQL (DbUnit) but you have problems retrieving it via the persistence layer, then look for missing schema constraints and, sometimes, subtle mapping problems.
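Here is what such a test might look like with the Hibernate 2 API of the day – a minimal sketch, assuming a hibernate.cfg.xml pointing at the (synchronized) test database, and assuming getAllClassMetadata() keyed by persistent class as in Hibernate 2; the class name is my own:

    import java.util.Iterator;
    import java.util.List;

    import junit.framework.TestCase;

    import net.sf.hibernate.Query;
    import net.sf.hibernate.Session;
    import net.sf.hibernate.SessionFactory;
    import net.sf.hibernate.cfg.Configuration;

    /** Retrieves one instance of every mapped class; a mapping/schema mismatch blows up here. */
    public class MappingSmokeTest extends TestCase {

        public void testRetrieveOneOfEachMappedClass() throws Exception {
            SessionFactory factory = new Configuration().configure().buildSessionFactory();
            Session session = factory.openSession();
            try {
                // One metadata entry per mapped class, keyed by the class itself.
                Iterator it = factory.getAllClassMetadata().keySet().iterator();
                while (it.hasNext()) {
                    Class persistentClass = (Class) it.next();
                    Query query = session.createQuery("from " + persistentClass.getName());
                    query.setMaxResults(1);
                    List result = query.list();
                    // An empty result means the test database lacks a row for this type.
                    assertFalse("no test data for " + persistentClass.getName(),
                            result.isEmpty());
                }
            } finally {
                session.close();
                factory.close();
            }
        }
    }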

You could go one step further by performing update and deletion operations and checking them via DbUnit, but we have found that if the retrieval works, the persistence layer is perfectly able to perform updates and deletions. Now, if your data layer is more complex, just use some mock objects to test it – because that's a code issue, not a mapping issue.

If you are interested in the topic, just let me know by mail (still waiting for comment integration with FreeRoller). And yes, I'm still looking for a tool able to do a), b), c) and d).

Note: there is a simple technique we are currently using. When the application starts, a simple retrieval is performed via the persistence layer for some objects that we know for sure must exist in all test and production databases. If this succeeds, you may be sure of two things: the database connection really works, and the mapping is probably fine. This way, you don't have to wait for the first persistence operation in order to see an error. Coupled with a nightly build and rerun, this little trick has proved quite effective at keeping the mapping clean.

Written by Adrian

March 1st, 2004 at 5:21 pm

Posted in Process

Hibernate DOES scale

I'm part of the development team migrating a mainframe ERP to Java technology. A preliminary version of the app shows no less than 878 tables. It could have been many more, considering that quite a lot of "lists of values" – which are not supposed to change frequently during the application's lifecycle – were turned into POJOs containing, well, lists of values. Even on this unfinished version, it was nice to see how Hibernate copes with this level of complexity.

Well, it scales rather nicely. The SessionFactory creation takes about 40 seconds (P4/2.4GHz/512MB), which is very decent in this context. The memory needed for all that metadata and the stored statements looks like a measly <10MB (although it's hard to tell, it being deeply buried in the entrails of Tomcat – no profiling yet, it's a bit early to invest time in that). And, what's more important, our custom app for browsing the database looks as snappy as ever. The next step – when time permits – will be to grind the app a little to see if there are any concurrency issues. But so far so good – Hibernate sure looks like a real winner.
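For what it's worth, the 40-second figure comes from nothing fancier than timing the factory build – a minimal sketch along these lines (my own illustration):

    import net.sf.hibernate.SessionFactory;
    import net.sf.hibernate.cfg.Configuration;

    /** Crude timing of SessionFactory startup over a large mapping set. */
    public class StartupTiming {
        public static void main(String[] args) throws Exception {
            long start = System.currentTimeMillis();
            // Reads hibernate.cfg.xml and parses every mapping file.
            SessionFactory factory = new Configuration().configure().buildSessionFactory();
            long elapsed = System.currentTimeMillis() - start;
            System.out.println("SessionFactory built in " + (elapsed / 1000.0) + " s");
            factory.close();
        }
    }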

Written by Adrian

November 17th, 2003 at 11:47 pm

Posted in Process

Test infected: come spread the disease

At www.opensourcetesting.org

Written by Adrian

July 8th, 2003 at 12:19 am

Posted in Process