Netuality

Taming the big bad websites

Archive for March, 2004

Getting rid of the dreaded o.e.c.i.r.AssertionFailedException


If you have ever written even the smallest standalone app using the SWT and JFace libraries, then you must know (and hate!) org.eclipse.core.internal.runtime.AssertionFailedException. This exception has the very bad habit of substituting itself for the initial one without keeping any trace of it. The official explanation goes like this:

“Jface is not guaranteed to work in standalone mode. As a matter of fact, Jface tries to report different errors using the Eclipse plugin mechanism, thus provoking a fatal assertion failure which is the only exception being reported.”

Ryan Lowe suggests wrapping [the code] in a try/catch block temporarily. Sure, this might just work if you know WHAT to wrap. What do you do when, after multiple commits and updates of the same source file modified by different people, you get the failed assertion? Wrap every slice of code committed in the last few hours in a try/catch? Hmmm… I don't think so.

An interesting solution (as proposed by Ryan) would be to use aspects in order to wrap each and every SWT call automatically. But this might bring serious performance issues, and I also have some doubts that it would catch everything.

But boy, aren't we lucky that Eclipse is open source? Just by looking at the stack trace (which is the same over and over, no matter where your real error is) you can spot the guilty party: org.eclipse.core.internal.runtime.InternalPlatform (found in runtime.jar/runtimesrc.zip). The little bugger is Mr. private static boolean initialized. Setting it to true involves starting the platform, which is quite a complex process judging by the loaderStartup(URL[] pluginPath, String locationString, Properties bootOptions, String[] args, Runnable handler) throws CoreException method. Unfortunately, it being a private variable means that reflection won't work, and dynamic proxying won't work either*. AKA Out Of Ideas (TM).

So, it's time to stop trying to be smart. I'll just be a rude, dumb programmer and simply modify the source, recompile and replace the .class files in runtime.jar. It's extremely simple yet sharply efficient:

  • initialize “initialized” to true
  • in the method private static void handleException(ISafeRunnable code, Throwable e), replace all the content with the following line: e.printStackTrace();code.handleException(e);. A stack trace in the console is enough for me at this stage, but of course you may use your favorite logging infrastructure to report the exception. The code.handleException call has (at least in my app) the effect of displaying an innocuous dialog box telling the user that something went wrong. Just the right dose of details, aka nothing at all (don't scare the poor user, please).
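
Here's roughly what the patched class ends up looking like – only a sketch of the two modifications, based on the Eclipse 2.x source; everything else in the class stays as shipped:

// inside org.eclipse.core.internal.runtime.InternalPlatform

// was false until the platform starts; we brazenly pretend it already did
private static boolean initialized = true;

private static void handleException(ISafeRunnable code, Throwable e)
{
    // replaced body: report the real exception instead of the bogus assertion
    e.printStackTrace();
    code.handleException(e);
}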

Well, what more is there to say? This stupid trick simply works.

On second thought, I might use RCP. Maybe, in a future episode.

*AFAIK. Please correct me if I'm mistaken here.

PS That was quick: “Setting the accessible flag on a reflected object of the java.lang.reflect.AccessibleObject class allows reading/modifying a private variable that you normally wouldn't have permission to touch. However, the access check can only be suppressed if approved by the installed SecurityManager.” Hmm, OK, so there IS a smarter solution after all. Left as an exercise for the reader :)
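
For the impatient, a minimal sketch of that exercise (assuming no SecurityManager vetoes the suppression; and it only covers the initialized flag – silencing handleException still requires the patch above):

import java.lang.reflect.Field;

// flip InternalPlatform.initialized to true without touching runtime.jar
Class clazz = Class.forName("org.eclipse.core.internal.runtime.InternalPlatform");
Field initialized = clazz.getDeclaredField("initialized");
initialized.setAccessible(true);     // suppress the private access check
initialized.setBoolean(null, true);  // static field, so the target is null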

Written by Adrian

March 25th, 2004 at 8:23 pm

Posted in AndEverythingElse


Scripting languages not just for toys (a Ruby web framework)


According to David Heinemeier Hansson, the Rails 1KLOC (!) web framework written in Ruby was used to develop Basecamp, a mildly complex project management webapp. What's fascinating is that the Ruby code used to develop Basecamp is only 4KLOC (according to the Rails document) and was developed in less than two man-months (including 212 test cases and all the bells and whistles). Although I strongly suspect that there was a lot of templating going on behind the scenes, and I doubt that template development time was included, the efficiency is amazing, especially if you consider that they started almost from scratch. Makes you wonder what would be possible in Ruby with a comprehensive library such as PHP's PEAR (well, duh, ignoring the fact that it would have to be called REAR, which is a very very very nasty name)?

Written by Adrian

March 24th, 2004 at 2:07 pm

Posted in Tools


Book review – Tapestry In Action


My first contact with Tapestry was more than 18 months ago. Back then, I was interested in finding a web framework to integrate with our custom solution built on the Avalon-based (and now obsolete) Phoenix server*. The web interface was meant to be backoffice stuff, for simple administration tasks as well as statistics reports. Given that the data access and business logic were already developed, we were looking for something simple to plug into a no-frills servlet container such as Jetty. We managed very easily to integrate Jetty as a Phoenix service and pass data through the engine context. But just as we finally integrated Tapestry [into Jetty [inside Phoenix]] and made it display some aggregated statistics, the project funding was cut and the startup went south. But that's another story, and a rather uninteresting one.

Meanwhile, things have changed a bit. Tapestry has become a first-class Apache Jakarta project, the Tapestry users list is more and more crowded, and I see it used again both in my day job (by Teodor Danciu, one of my coworkers and incidentally the author of JasperReports) and in some moonlighting of my own on an older web project idea. And there is exceptional Eclipse support via the Spindle plugin. While the ‘buzzword impact’ of Tapestry on a Java developer's CV doesn't yet measure up to Struts, this framework has obviously gained a lot of attention lately.

So, what's so special about it? If I had to choose only one small phrase, I'd quote Howard Lewis Ship, Tapestry's lead developer, from the preface of his book ‘Tapestry in Action’:

The central goal of Tapestry is to make the easiest choice the correct choice.

In my opinion this is the conceptual center of gravity of the framework. Everything, from the template system (which has only the bare minimum of scripting power), through the componentized model, up to the precise, detailed error reporting (quite a unique feature in the open source frameworks world), gently pushes you (the developer) to Do The Right Thing: put logic where it belongs (classes, not templates), organize repetitive code into components, ignore the HTTP plumbing and use a real, consistent MVC model in your apps (forms are readable and writable components of your pages). You don't need to be Harry Tuttle to make a good Tapestry webapp; a decent Java developer is enough. That's more than I can say about Struts…

Coming from a classic JSP-based webapp world, Tapestry is a real culture shock. The most appropriate way to visualise the difference is to imagine a pure C programmer abruptly switching to C++ and the objects world**. For a while, he will try to emulate the ‘old’ way of working, but soon enough he'll give up and start coding his own classes. However, this C programmer will have to make some serious efforts, not necessarily because OOP is hard to learn, but in order to break his/her old habits.

“Tapestry in Action” is your exit route from the ugly world of stateless HTTP pages and spaghetti HTML intertwingled with Java code and various macros. It's one of the best JSP detoxification pills available on the market right now.

The first part of the book (‘Using basic Tapestry components’) is nothing to brag about. It's basically an updated and nicely organized version of the various tutorials already available via the Tapestry site, except perhaps for some sections in chapter 5 (‘Form input validation’). By the way, chapter 5 is freely downloadable from the Manning site and is a perfect read if you want a glimpse of the fundamental differences between Tapestry and a classic web framework (form validation being an essential part of any dynamic site). However, if you want to get past the ‘Hangman’*** phase, you really need to dig into the next two book sections.

The second section, ‘Creating Tapestry components’, is less covered by the documentation and tutorials. I'm specifically pointing here to the subsections ‘Tapestry under the hood’ (juicy details about page and component lifecycles) and ‘Advanced techniques’ (there's even an ‘Integrating with JSP’ chapter!). While it is true that any point from this section will generally be revealed by a search on the Tapestry user list or (if you're patient) by a kind soul answering your question on that same list, it's nevertheless a good thing to have all the answers nicely organized on the corner of your desk.

The third and last section (‘Building a complete Tapestry application’) is a complete novelty for Tapestry fans. It's basically a thorough description of how to build a web application (a ‘virtual library’) from scratch using Tapestry. While the JBoss-EJB combination chosen by the author is not exactly my cup of tea (I'm rather into the Jetty+PicoContainer+Hibernate stuff), I can understand the strong appeal it is supposed to have among the J2EE jocks. Anyway, given the componentized nature of Tapestry, I should be able to migrate it relatively easily if I feel the need. The example app is contained in a hefty 1Meg downloadable archive, build and deployment scripts included.

To conclude, ‘Tapestry in Action’ is a great book about how to change the way you develop web applications. The steep learning curve is a small price to pay for a two- or three-fold improvement in overall productivity. And this book should get you started really quickly.

*Which AFAIK is still used at our former customer.
**There were some posts on the Tapestry user list about a certain conceptual resemblance with Apple's WebObjects. I can't really pronounce on this because I do not know WebObjects, but the name in itself is an interesting clue.
***‘Hangman’ is the Tapestry ‘Petshop’ (although there is also a ‘real’ Tapestry Petshop referenced in the Wiki).

Written by Adrian

March 18th, 2004 at 2:56 pm

Posted in Books


Eclipse plugins and Groovy : when binary compatibility is not enough


One of my current responsibilities is to maintain an internally developed plugin, used by various members of the team to generate code from the analysis model. As far as I can tell from the webstats of the update site, every version is downloaded by 18 people, a small but heterogeneous user base.

My biggest problem is the Eclipse version. The analysts are not exactly Java geeks waiting anxiously for nightly builds of Eclipse; they use a 'standard' 2.1.2, mainly because it's stable and well internationalized. Things get wilder in the programmers' team: versions ranging from conservative (2.1.x) to liberal (3.0M4), and even the occasional dumbass with the latest integration build (that would be me, of course).

The 'enhanced binary compatibility' in 3.0M7 came as a relief, diminishing the need to switch between Eclipse versions in order to develop the plugin or work on other tasks. Well, I still have to briefly test the damn thing on Eclipse 2.1.x before releases. However, running two or three Eclipse instances simultaneously is no piece of cake for my 512Mb laptop (I still haven't found out who I have to kill around here in order to be awarded a memory upgrade). Unfortunately, checking out the plugin source into M7 has shown the invisible ugly face of 'binary compatibility': the plugin doesn't compile.

There are just a handful of offending lines of code, some exposing differences in the Eclipse API which are somehow hidden in 'compatibility mode', some effectively revealing small bugs in plugin behavior. But the real issue here is that I cannot really develop the plugin in M7 until I somehow manage to compile it, while not losing backward compatibility.

Let's dissect one of the compilation issues. The bummer concerns the automatic opening of an editor (or focusing it if already opened) when clicking on its reference (somewhat similar to what happens when you Ctrl+click on a class name in JDT). In the older API it was a simple matter of page.openEditor(file);, where page is an IWorkbenchPage and file is an IFile. This simple stuff worked well until 3.0M4; then (M5) things changed to page.openEditor(new org.eclipse.ui.part.FileEditorInput(file), editorId);, where FileEditorInput implements (among others) IEditorInput. While this is certainly nice, because you may directly link editors to something other than files***, the old code obviously does not compile under M7.

Maintaining different projects for 'old' and 'new' style code, for 10 or so lines, is obviously overkill. The second solution would be reflection, but it would mean more than a few lines of code, and the result would be neither comprehensible nor maintainable. The only thing left: use a scripting language.

Of course, I could have taken any decent scripting language embeddable in Java. The decision to go with Groovy was made mainly because of its coolness factor, but I am sure the idea applies just as easily to Jython (a big favorite of mine) or the performance-aware Pnuts, for instance.

In a nutshell, you have to execute one line of code or another depending on the current Eclipse version (it's a little bit trickier than that, but we'll discuss it later).

groovy.lang.Binding binding = new Binding();
binding.setVariable("page", page);
binding.setVariable("file", file);
binding.setVariable("editorId", editorId); // don't forget this one, the 'new' script uses it
groovy.lang.GroovyShell groovyShell = new GroovyShell(getClass().getClassLoader(), binding);
if (newPlatform)
{
    return groovyShell.evaluate("page.openEditor(new org.eclipse.ui.part.FileEditorInput(file), editorId);", someExpressionId);
}
else
{
    return groovyShell.evaluate("page.openEditor(file);", someExpressionId);
}

It's basically a vanilla-flavored ripoff of the Groovy embedding example from the docs. The boring part, caching the binding and hiding everything behind a nice facade, is left as an exercise for the [interested] reader. Remember to pass the classLoader of the current class; do not create a GroovyClassLoader out of nowhere, or you'll end up dealing with Eclipse's own class loader, which means trouble even for simple tasks like these.
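
For the curious, here is what such a facade might look like – a minimal sketch with hypothetical names (EditorOpener and the pseudo file name openEditor.groovy are mine, not from the plugin), and just as non-thread-safe as the snippet above:

import groovy.lang.Binding;
import groovy.lang.GroovyShell;
import org.eclipse.core.resources.IFile;
import org.eclipse.ui.IWorkbenchPage;

public class EditorOpener
{
    private final GroovyShell shell;
    private final Binding binding;
    private final boolean newPlatform;

    public EditorOpener(boolean newPlatform)
    {
        this.newPlatform = newPlatform;
        this.binding = new Binding();
        // reuse one shell and binding instead of rebuilding them on every call
        this.shell = new GroovyShell(getClass().getClassLoader(), binding);
    }

    public Object open(IWorkbenchPage page, IFile file, String editorId)
    {
        binding.setVariable("page", page);
        binding.setVariable("file", file);
        binding.setVariable("editorId", editorId);
        String script = newPlatform
            ? "page.openEditor(new org.eclipse.ui.part.FileEditorInput(file), editorId)"
            : "page.openEditor(file)";
        return shell.evaluate(script, "openEditor.groovy");
    }
}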

How do we know whether the Eclipse version is the 'new' or the 'old' one? Not that simple, because remember: 'old' means anything from 2.x up to 3.0M4. So finding out the Eclipse SDK version is not enough; you have to find another discriminant, which in our case is the presence of the 'org.eclipse.ui.ide' plugin. The result:

boolean newPlatform;
//find out if we are inside a new or an old platform
PluginVersionIdentifier pvi = Platform.getPluginRegistry().getPluginDescriptor("org.eclipse.platform").getVersionIdentifier();
newPlatform = pvi.getMajorComponent() >= 3 && Platform.getPluginRegistry().getPluginDescriptor("org.eclipse.ui.ide") != null;

No, we are not ready to deploy yet. A small trick has to be performed, or the plugin won't start under older versions of Eclipse. We had to add some new plugins to the dependencies (in the plugin descriptor), such as the aforementioned 'org.eclipse.ui.ide'; obviously, the older versions of Eclipse will not find it and hence will block our plugin activation on startup. In order to overcome this, you have to add (by hand!) a lesser-known attribute ('optional') to the corresponding tag in the plugin.xml file: <import plugin="org.eclipse.ui.ide"/> becomes <import plugin="org.eclipse.ui.ide" optional="true"/>. Now the plugin is ready to be deployed.

For those brave enough to dare distributing such a plugin via an update site: remember to 'cheat' by not allowing new plugins such as 'org.eclipse.ui.ide' into the feature.xml file (again, delete by hand). The 'optional' attribute doesn't help in this case. Go figure…

I hope that some of you will find this recipe useful for maintaining compatibility between different Eclipse versions with a minimum of fuss. However, please note the specific prerequisites for this type of solution:

  • the modifications are only simple, 'few-lines' ones
  • the code is not expected to evolve much in the 'affected' areas
  • the evaluated code is not in a performance-sensitive area

***Interestingly enough, this was one of the reasons I recommended against adopting RCP in one of our apps a few weeks ago. It's nice to see that – now – the mechanism linking editors and resources is MUCH more flexible. Anyway, this probably won't change the decision not to use RCP, because the main issue is the volume of code we would have to change. Development of one of the app modules started almost a year ago, and the animal is already sold and deployed on different production sites: upgrading would be a real nightmare. Maintaining a fork of the app is not an option either. Well, I guess we'll just have to cope with 'plain old' JFace and SWT.

PS After some days of 'silence', I have noticed from the logs that the most popular posts on my blog are those concerning Eclipse plugins and Manning books (I seem to have a nice Google ranking on these topics). So, expect more of these (I am reading the MEAP of 'Tapestry in Action' – a review should be up shortly).

Written by Adrian

March 1st, 2004 at 6:36 pm

Posted in Tools


Eclipse 2.1 workspace deadlock – and a dirty but small workaround


It happened on older versions too, but it happens more frequently on the “final” 2.1 version. FYI: Gentoo Linux, Eclipse GTK; it seems to be related somehow to bug 33138 (I don't have the time to dig further).
Sometimes the monster simply hangs during a [take your pick: refactoring, new class generation] with an empty progress bar in the dialog box and a completely useless “Cancel” button. Been there, done that: kill -9 …
Then, trying to restart Eclipse leads to a deadlock while recovering the workspace: dialog box, empty progress bar and useless “Cancel”. Frozen!
I have a lot of settings and projects, so deleting the whole .metadata directory is just too painful. Therefore, I had to find a smaller workaround: just delete the file .metadata/.plugins/org.eclipse.ui.workbench/workbench.xml and Eclipse restarts with a clean workbench. Some adjustments are lost, but hey – my metadata is still there.

Written by Adrian

March 1st, 2004 at 5:21 pm

Posted in Tools


Effective testing of database schema – the missing link


There is a certain contradiction in modern projects concerning unit testing strategy. On the one hand, there is a powerful assertion stating that business logic testing should be completely disconnected from the database. This makes perfect sense in a certain way: the tests should check the business logic, not the database and/or the persistence layer. Besides, the persistence layer is generally a fully-fledged product (such as the excellent Hibernate) or some other JDO-esque solution which has its own testing suite – no need to check that it really works. Usually, the link between business objects and persistence is “faked” using mock objects. Basically, this means that testing the code doesn't need a running database (well, code testing doesn't need a database at all).

The database schema should also be tested – the only tool I am aware of is the excellent DbUnit. Although targeted more towards data testing, it copes quite well with schema testing. Nicely integrated with Ant, DbUnit is the right solution for your database testing needs. And yes, you do need to test your database, since it is supposed to evolve along with the code (there's a great article about evolutionary database design on Martin Fowler's site).

Somehow, we instinctively feel that something is missing from this picture. We are testing the code, disconnected from the database – and also the database, in an independent manner. But how can we be sure that the persistence layer between the application model and the database is OK? And I'm not talking about the persistence mechanics, but the data model itself. Basically, this comes down to mapping testing. I am aware that some special O/R bindings do not need mappings, having a direct object-table correspondence, but I feel that this is generally a BadIdeaTM since it hampers the flexibility of both the application model structure and the database schema.

In the small-to-medium-sized projects I've been working on lately, we didn't feel the need for mapping testing. This has a very simple reason: the person performing the change on the database schema is usually the same person who needs a certain modification in the application model. After performing the modification, quite often this same person starts the application and makes a functional test which implicitly checks the mapping. Most of the time this works just fine.

However, some nasty problems might appear when the project starts to grow:

  • changing the mapping becomes more difficult; some kind of testing might give indications about the nature of the problem.
  • there is a certain “schema decay”: some foreign keys cannot be created at a certain point, then their creation is forgotten once the data finally becomes consistent. As the schema evolves further, more and more object model relations will not be backed by integrity constraints.
  • you may sometimes end up with unmapped and unused tables/views/columns.

A really useful testing tool should be able to check one or multiple mapping files against a database schema (via DbUnit, why not). The tool should:

  • a) recognize different mapping formats (Hibernate, Castor, etc.) and different database types
  • b) match the mapping declarations with the tables from the database, checking their existence as well as the types of the primitive columns
  • c) warn if some constraints are wrong or missing (based on simple aggregation, cardinality or other hints from the mapping structure)
  • d) warn about unmapped tables/views/columns.

Here's the good news: a tool able to perform a) and b) does exist! And the bad news (purists will jump with disgust): just for a moment, you should forget about testing your code without the database. The solution is quite simple: build a unit test which fires up the persistence layer and retrieves at least one of each type of mapped object from a test database. If no exceptions are encountered, the test is OK (a minimal sketch follows after the list below). This is a basic but effective approach, and:

  • be prepared to have a testing database different from the development database, but with an automatically synchronized schema.
  • harden your test case by inserting the most “exotic” test data you can find. If the data goes in via SQL (DbUnit) but you have problems retrieving it via the persistence layer, then look for missing schema constraints and, sometimes, subtle mapping problems.
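
Here is that minimal sketch, Hibernate 2 flavor; the com.example entity names are made up for the illustration, list your own mapped classes instead:

import java.util.List;
import junit.framework.TestCase;
import net.sf.hibernate.Query;
import net.sf.hibernate.Session;
import net.sf.hibernate.SessionFactory;
import net.sf.hibernate.cfg.Configuration;

// Mapping smoke test: fetch one instance of every mapped class from the test
// database; schema/mapping mismatches surface as plain old exceptions.
public class MappingSmokeTest extends TestCase
{
    // hypothetical entities – replace with your own mapped classes
    private static final String[] MAPPED_CLASSES =
        { "com.example.Patient", "com.example.Visit" };

    public void testEveryMappedClassIsRetrievable() throws Exception
    {
        SessionFactory factory = new Configuration().configure().buildSessionFactory();
        Session session = factory.openSession();
        try
        {
            for (int i = 0; i < MAPPED_CLASSES.length; i++)
            {
                Query query = session.createQuery("from " + MAPPED_CLASSES[i]);
                query.setMaxResults(1); // one row is enough to exercise the mapping
                List result = query.list();
                assertFalse("no test data for " + MAPPED_CLASSES[i], result.isEmpty());
            }
        }
        finally
        {
            session.close();
            factory.close();
        }
    }
}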

You could go one step further by performing update and deletion operations and checking them via DbUnit, but we have found that if the retrieval works, the persistence layer is perfectly able to perform updates and deletions. Now, if your data layer is more complex, just use some mock objects to test it – because that's a code issue and not a mapping issue.

If you are interested in the topic, just let me know by mail (still waiting for comment integration with FreeRoller). And yes, I'm still looking for a tool able to do a), b), c) and d).

Note: There is a simple technique that we are currently using. The idea is that, when the application starts, a simple retrieval is performed via the persistence layer for some objects that we know for sure must exist in all test and production databases. If this succeeds, you may be sure of two things: that the database connection really works and that the mapping is probably fine. This way, you don't have to wait for the first persistence operation in order to see an error. Coupled with a nightly build and rerun, this little trick proved quite effective at keeping the mapping clean.
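
A minimal sketch of that startup check; AppSettings and the id value are hypothetical, application-specific choices:

// fail fast if the connection or the mapping is broken
Session session = factory.openSession();
try
{
    session.load(AppSettings.class, new Long(1)); // a row known to exist everywhere
}
catch (Exception e)
{
    throw new RuntimeException("database sanity check failed at startup", e);
}
finally
{
    session.close();
}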

Written by Adrian

March 1st, 2004 at 5:21 pm

Posted in Process


Using Jython to internationalize a PHP app


At first, this might seem a mind-boggling combination. What do Jython and PHP have in common (excepting the fact that I am a Python fan and my current consulting task is in a PHP project)?

Well, internationalizing a PHP app is pretty much a trivial task. If you are a sensible PHP programmer insisting on using PEAR instead of randomly choosing a script from the tons of snippets populating the “scripting websites”, I18N is probably the safest choice.

Maybe – for you – application maintainability and performance are not exactly important concerns. For me, they are. This is why I chose to store internationalized texts in files rather than in the database. I'd rather keep the database for real data, which is created, modified, aggregated and such. And I'd rather have an internationalized error message on the screen even if the database is down.

Now we know that we'll use I18N and that the text will be kept in some PHP files. However, I am no professional translator, and I have no desire to translate or to manually maintain the correspondence between translator files and PHP files (no, translators won't modify PHP code, stop this nonsense right away). Code generation immediately comes to mind. Basically, my first idea was to investigate whether the files used by the translators could be quickly transformed into PHP, and whether I would be able to generate their formats from my own files (a “roundtrip internationalization process”?). Unfortunately, this is not an easy task, as the only clue was that the translators use Office tools such as Word or Excel, because they rely upon some specialized translation software integrated with these products. The easy choice is Excel, since it allows a better organization of data than having to search for tables in a Word document. The hard choice is the tool I'd use for automatically reading and even generating Excel files.

The difficulty comes from the fact that I don't have Windows with Office installed on my desktop, just Gentoo Linux and OpenOffice. Thus, I am unable to write a simple Python script which could perform my generation tasks via automation. Fortunately, this is not the first time I have been confronted with the issue. I happen to know a very nice Java tool that I wholeheartedly recommend for your Excel processing needs: JExcelApi.

Still, Java is a heavyweight programming language – it would be a really bad idea to fire up the monster just for some easy processing of Excel files. Here's why Jython comes naturally into the equation. Four hours and about 100 lines of debugged code later, here I am sitting on top of a perfectly functional internationalization tool which:

  • generates PHP code from a big xls file (the root vocabulary) which centralizes all the internationalization texts
  • generates 2-language xls files for the translators' usage
  • updates the root vocabulary starting from the files modified by the translators

Automation scripts are already in cron, and there's also a nice text document explaining to the translators where to get their files and where to put them after modification. The resulting script is not exactly fast, but this is tooling and not production, so it should not be a problem after all.

Whatever your project constraints are, give Jython a try and you'll be amazed… As they put it on the Useless Python site – If it were any simpler, it would be illegal.

Finally, there's a trick not quite related to Jython, but nevertheless interesting. There is an easy way of solving the problem of translating phrases with real data inside them, with easy parameter swapping. We'll use the good old sprintf, but not directly: we'll pass through a not-so-popular but extremely useful function, call_user_func_array. Suppose that our example needs the user name and the authorization profile description to display inside a nice message. All you have to do is to define placeholders in the I18N files which would fit as the first argument for sprintf. The following example should make it clearer:

localization/en/login.php

$messages = array(
  'loggedin' => 'You are authenticated successfully as user %1$s with profile %2$s.'
);
$this->set($messages);

localization/fr/login.php

$messages = array(
  'loggedin' => 'Vous avez le profil %2$s en tant qu\'utilisateur %1$s.'
);
$this->set($messages);

Simple passing of multiple parameters to I18N in PHP. Example function without error processing or data domain checking (note the escaped apostrophe in the French string, mandatory inside single quotes):

#this is the multiple parameter function (a method of our Tools helper class)
function complexTranslation($i18n, $label, $params)
{
  return call_user_func_array('sprintf', array_merge(array($i18n->_($label)), $params));
}

Then, you have to initialize your I18N object. This can be done in a generic manner for all pages.

#specific I18N initialization stuff
require_once 'I18N/Messages/File.php';
$g_language_dir = dirname($_SERVER['PATH_TRANSLATED']).'/localization/';
$i18n =& new I18N_Messages_File($g_langCode, $script_name, $g_language_dir);

Finally, use the function.

#translate the successful login message
$loginbox = Tools::complexTranslation($i18n, 'loggedin', array($operator->name, $profile->description));
    

Written by Adrian

March 1st, 2004 at 5:20 pm

Posted in Tools


Ant goodies: extracting info from Eclipse .classpath


IMPORTANT UPDATE: Please note that 'antclipse' is now part of the ant-contrib project at SourceForge, under the Apache licence.

Original blogpost:

I hate duplicating information manually – besides, it's a known fact that duplication is a classic code smell telling you to refactor. This time it's not Java code, but something somewhat different: Ant used in an Eclipse context. The issue here is that the .classpath files generated by Eclipse contain important information which is usually duplicated by hand in the build.xml script. So many times I've changed libraries in my project in Eclipse, only to discover that the Ant task was broken…

There surely are some workarounds, like the task written by Tom Davies, but unfortunately:

• It's an Eclipse plugin. I want to be able to build my project standalone; we don't need no stinkin' plugin.
• It's rather old, with a draconian checking of tags, so it pukes on my 3.0M3, complaining about a certain attribute of type “con” in the .classpath file (lesson learned: don't be picky about tag and attribute names if you want your code to work with future versions of the software which produced the XML document, especially when you do not have a schema or DTD to rely on).
• It's Friday evening, the weather outside is dark, I'm alone in the house and the TV is broken (and even if it worked, there's nothing on TV anyway). Boys and girls, let's write an Ant task!

From the documentation, it appears that writing an Ant task should be an easy task :) . And yes, it is, once you get past all the little idiosyncrasies. Like the mandatory “to” string in a RegexpPatternMapper, although all you want to do is matching, not replacing. Like having completely different mechanisms for Path and FileSet (I've always thought a Path was a “dumbed down” FileSet, but I was completely wrong: a fileset is somewhat “smarter”, but it only has a single root directory).
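
For reference, this is only the general skeleton of a custom task (not antclipse itself): a setter per attribute, plus an execute() method that does the work.

import org.apache.tools.ant.BuildException;
import org.apache.tools.ant.Task;

public class MyTask extends Task
{
    private String produce;

    // Ant calls this setter for the produce="..." attribute
    public void setProduce(String produce)
    {
        this.produce = produce;
    }

    public void execute() throws BuildException
    {
        if (produce == null)
        {
            throw new BuildException("the 'produce' attribute is mandatory");
        }
        log("producing a " + produce + " from .classpath");
        // ... parse .classpath and register the paths/filesets with the project
    }
}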

The result is here, and all you have to do is download and put the antclipse.jar (7kB) in your ant/lib directory and you're set (just remember to refresh the Ant classpath if you're launching Ant from Eclipse).

What does it do? Well, it creates classpaths or filesets based on the current .classpath file generated by Eclipse, according to the following parameters:

• produce – Tells the task whether to produce a “classpath” or a “fileset” (multiple filesets, as a matter of fact). Required.
• idcontainer – The refid which will serve to identify the deliverables. When multiple filesets are produced, their refid is a concatenation between this value and something else (usually obtained from a path). Default “antclipse”. Optional.
• includelibs – Boolean, whether or not to include the project libraries. Default is true. Optional.
• includesource – Boolean, whether or not to include the project source directories. Default is false. Optional.
• includeoutput – Boolean, whether or not to include the project output directories. Default is false. Optional.
• verbose – Boolean, telling the task to print some info during each step. Default is false. Optional.
• includes – A regexp for files to include. It is taken into account only when producing a classpath; it doesn't work on source or output files. It is a real regexp, not a “*” expression. Optional.
• excludes – A regexp for files to exclude. Same restrictions as includes. Optional.

Classpath creation is simple: it just produces a classpath that you can subsequently retrieve by its refid. The filesets are a little trickier, because the task produces one fileset per directory in the case of sources, and another separate fileset for the output directory. Which is not necessarily bad, since the content of each directory usually serves a different purpose. Now, in order to avoid conflicting refids, each fileset has a name composed of the idcontainer, followed by a dash and postfixed by the path. Supposing that your output path is bin/classes and the idcontainer is left at its default, the task will create a fileset with refid antclipse-bin/classes. The fileset will include all the files contained in your output directory, but without the leading path bin/classes (as you usually strip it when creating the distribution jar). If you have two source directories, called src and test, you'll be provided with two filesets, with refids like antclipse-src and antclipse-test.

However, you don't have to hardcode the path, since some properties are created as a “byproduct” each time you execute the task. Their names are the idcontainer value postfixed by “outpath” and “srcpath” (in the case of the sources, you'll find the location of the first source directory).

A pretty self-explanatory Ant script follows (“xml” is a forbidden file type on jroller, so just copy-paste it into your favourite text editor). Note that nothing is hardcoded; it's an adaptable Ant script which should work in any Eclipse project.

<?xml version="1.0"?>
<project default="compile" name="test" basedir=".">
  <taskdef name="antclipse" classname="fr.infologic.antclipse.ClassPathTask"/>

  <target name="make.fs.output">
    <!-- creates a fileset including all the files from the output directory, called ecl1-bin/ if your binary directory is bin/ -->
    <antclipse produce="fileset" idcontainer="ecl1" includeoutput="true" includesource="false" includelibs="false" verbose="true"/>
  </target>

  <target name="make.fs.sources">
    <!-- creates a fileset for each source directory, called ecl2-*source-dir-name*/ -->
    <antclipse produce="fileset" idcontainer="ecl2" includeoutput="false" includesource="true" includelibs="false" verbose="true"/>
  </target>

  <target name="make.fs.libs">
    <!-- creates a fileset containing all your project libs, called ecl3/ -->
    <antclipse produce="fileset" idcontainer="ecl3" verbose="true"/>
  </target>

  <target name="make.cp">
    <!-- creates a classpath with the project libs plus the output directory, called eclp -->
    <antclipse produce="classpath" idcontainer="eclp" verbose="true" includeoutput="true"/>
  </target>

  <target name="compile" depends="make.fs.libs, make.fs.output, make.fs.sources, make.cp">
    <echo message="The output path is ${ecl1outpath}"/>
    <echo message="The source path is ${ecl2srcpath}"/>
    <!-- makes a jar file with the content of the output directory -->
    <zip destfile="out.jar"><fileset refid="ecl1-${ecl1outpath}"/></zip>
    <!-- makes a zip file with all your sources (supposing you have only one source directory) -->
    <zip destfile="src.zip"><fileset refid="ecl2-${ecl2srcpath}"/></zip>
    <!-- makes a big zip file with all your project libraries -->
    <zip destfile="libs.zip"><fileset refid="ecl3"/></zip>
    <!-- imports the classpath into a property, then echoes the property -->
    <property name="cpcontent" refid="eclp"/>
    <echo>The newly created classpath is ${cpcontent}</echo>
  </target>
</project>

TODOs: make “includes” and “excludes” work on the source and output filesets, find an elegant solution to the multiple filesets/directories issue, and, most importantly, make it work with files referenced in other projects.

I am aware that the task is very far from being perfect, so just download it if you're interested, try to use it, try to break it, and tell me what you think and how it can be improved. Also, if you're interested in the source, just send me an email, but be aware that it's Friday-evening, beer-induced source code, nothing to be proud of… It was only tested with Ant 1.5.x, so YMMV. I assume no responsibility if you use it in a production environment.

Written by Adrian

March 1st, 2004 at 5:17 pm

Posted in Tools


A smoother, gentler hibernation


Last week, while optimizing a Java app, we stumbled upon an interesting trick. Well, I suppose it's interesting, since I haven't been able to find any trace of it in the Hibernate docs or FAQ.

So, you are using Hibernate for the O/R persistence layer of your latest Java app. Welcome to the club. Suppose that your app is distributed, or that your business logic runs on multiple servers for performance reasons. In other words, your database is very frequently accessed from multiple points. Thus, each and every display of a projection of your data (like an innocent “patients list” screen) has to perform a data retrieval operation, aka a SELECT. You are not yet at the point of giving up realtime functionality for performance reasons (via complex caching), but your queries still seem pretty slow. And yet you are using probably the fastest O/R mapping tool alive.

The easiest path to data persistence passes through the Hibernate Transaction API. And your transaction looks like this (c/p from the docs):

[...]
Session s = sessions.openSession();
Transaction tx = null;
try
{
    tx = s.beginTransaction(); // note: the transaction is started on the Session
    fooList = s.find(
    "select yummy from Big where complex");
    tx.commit();
}
catch (Exception e)
{
    if (tx != null) tx.rollback();
    throw e;
}
finally
{
    s.close(); // close the session on all paths, not only on error
}
return fooList; //or something similar, which goes to the view, via controller
[...]


And here's the trick: if you do only SELECTs, there is no point in committing the transaction. Because (and I'll quote the manual again):

“Flushing the session
If you happen to be using the Transaction API, you don't need to worry about this step. It will be performed implicitly when the transaction is committed.”


Even if you know for sure that you haven't modified your data, the API still has to check for modifications! And when the data is pretty complex, this might take a pretty long time. Hence the following approach:

[...]
Session s = sessions.openSession();
Transaction tx = null;
try
{
    tx = s.beginTransaction();
    fooList = s.find(
    "select yummy from Big where complex");
    tx.rollback(); // read-only: rolling back skips the dirty-checking flush
}
catch (Exception e)
{
    if (tx != null) tx.rollback();
    throw e;
}
finally
{
    s.close();
}
return fooList; //or something similar, which goes to the view, via controller
[...]


shouldn't change anything in application behavior, all for a “transaction time” divided by 3.
Hey, that is great… The app runs noticeably faster – for one line of code.

Feedback: We now have a “documentation-compliant” solution thanks to Viktor Szathmary, who suggested a session.setFlushMode(FlushMode.NEVER) on the specific session. We haven't profiled this yet, though. We also have a complication due to the fact that each session is used a few times before being thrown away. No, we do not pool Hibernate sessions, but there's a fair bit of reuse sometimes, behind the business logic. Depending on the type of transaction implied, the setFlushMode should change (or not). OK, I have to admit it's a legit idea, but it's a supplementary line of code. And where's the fun? :)
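
For reference, a minimal sketch of that variant (FlushMode lives in net.sf.hibernate; a read-only session never flushes, so the commit/rollback gymnastics disappear):

[...]
Session s = sessions.openSession();
s.setFlushMode(FlushMode.NEVER); // read-only session: no dirty-checking flush, ever
try
{
    fooList = s.find(
    "select yummy from Big where complex");
}
finally
{
    s.close();
}
return fooList; //as before
[...]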

Written by Adrian

March 1st, 2004 at 5:15 pm

Posted in Tools


Real life Jakarta Velocity – simple optimizations

    leave a comment

Here's another nice “real life” story I'd like to share, and this time it's about Velocity. Nothing nasty, just a quick check of whether and how well Velocity caching performs, and of why and how to write your own Velocity ResourceLoader.

The guys I'm working with are migrating a rather huge ERP app from mainframe to a Java multi-tier architecture. This isn't exactly a quick and easy job, so basically my first step here is to find out a lot of things about the old system (via some boring but extremely necessary training). However, in order not to lose my so-called “Java skills”, I am also performing some tasks, mainly testing and optimization stuff, preparing the baby to face the harsh real world.

If you haven't seen an ERP tailored for production sites before, you'll be amazed at the massive number of barcode stickers which have to be printed. They are everywhere, from production to distribution: relaying boxes, smaller boxes, bigger boxes, packs, containers, everything your mind can think of.
These barcodes are produced on special printers which are usually connected to the production systems via serial port (an IP connection is possible, but quite expensive, so it's used only in very special cases, like really large warehouses).


[Barcode image]

Then, there are the rather thick clients (SWT) deployed at different points of the production/packaging/distribution workflow. Each one has its particularities; however, they ALL have to print barcodes, and print them FAST.

This barcode stuff is not as simple as you might think. Depending on the specific point in the workflow, a different barcode must be printed, containing different data, or maybe similar data in other printable formats. This is a perfect fit for a tool such as Velocity.

The main issue here is that the templates are not in the filesystem; they are extracted from a central database, where they are stored and managed by specific tools (an IDE-like tool is used to position the different barcode elements on the printed stickers). The first [and easiest] solution was to use the Velocity.evaluate() function. This one-liner worked just fine until the performance test, where it was decided that the barcode generation was too slow. It wasn't apparent at first, but you see – in packaging, half a second is a pretty long time, and the cumulated delays might make the customer lose some serious money at the end of the day.

The first idea was to look for a way of using Velocity's ResourceLoader, thus being able to drop evaluate() and use the classic VelocityEngine-Context-Template-merge mantra:

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import org.apache.commons.collections.ExtendedProperties;
import org.apache.velocity.exception.ResourceNotFoundException;
import org.apache.velocity.runtime.resource.Resource;
import org.apache.velocity.runtime.resource.loader.ResourceLoader;

public class MyResourceLoader extends ResourceLoader
{
    public void init(ExtendedProperties configuration)
    {
        // nothing to configure: the templates come from the singleton below
    }

    public InputStream getResourceStream(String templateName) throws ResourceNotFoundException
    {
        //TODO: proper exception processing
        String template = TemplateContentsSingleton.getUniqueInstance().getTemplate(templateName);
        if (template == null)
        {
            throw new ResourceNotFoundException("MyResourceLoader error: cannot find resource " + templateName);
        }
        return new ByteArrayInputStream(template.getBytes());
    }
    [...]
}

This is pretty “spike-ish” and completely non-thread-safe code; kids, don't try this at home without correctly processing all errors and managing modification flags. Basically, the loader uses a TemplateContentsSingleton class that wraps a hashmap of templates, indexed by their keys (which, by the way, are Strings, and that's just about perfect).

public class TemplateContentsSingleton
{
    /** unique instance */
    private static TemplateContentsSingleton sInstance = null;
    /** template containers */
    private Map tmplContainers = new HashMap();

    /** Private constructor */
    private TemplateContentsSingleton()
    {
        super();
    }

    /** Get the unique instance of this class. */
    public static TemplateContentsSingleton getUniqueInstance()
    {
        if (sInstance == null)
        {
            sInstance = new TemplateContentsSingleton();
        }
        return sInstance;
    }

    public void setTemplate(String key, String templateContent)
    {
        this.tmplContainers.put(key, templateContent);
    }

    public String getTemplate(String templateName)
    {
        return (String) this.tmplContainers.get(templateName);
    }
}

For your extreme comfort, this is an ultra-classic singleton containing a hashmap.
Don't forget to initialize Velocity with the corresponding properties:

Properties veloProps = new Properties();
veloProps.setProperty("resource.loader", "custom");
veloProps.setProperty("custom.resource.loader.description", "Customized Velocity Template Resource Loader");
veloProps.setProperty("custom.resource.loader.class", MyResourceLoader.class.getName());
veloProps.setProperty("custom.resource.loader.path", "");
veloProps.setProperty("custom.resource.loader.cache", "false");

Finally, we'll be able to test the effect of caching the different objects. The obvious candidates for caching are the template and the context. We'll render one of the (pretty small) templates 1000 times, then compute the mean rendering time. We'll run the benchmark with Velocity caching disabled and enabled (by setting custom.resource.loader.cache to “true”). OK, let's get to work:
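
The measurement itself is nothing fancy; a sketch of the timing loop, reusing the engine and context from the previous snippet:

// merge the same template 1000 times and average the elapsed time
long start = System.currentTimeMillis();
for (int i = 0; i < 1000; i++)
{
    StringWriter writer = new StringWriter();
    engine.getTemplate("sticker-42").merge(context, writer);
}
long elapsed = System.currentTimeMillis() - start;
System.out.println("mean rendering time: " + (elapsed / 1000.0) + " ms");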

[Timing graph for the rendering benchmark]



We see clearly that there is basically no performance difference between Velocity caching and “hand-picked” object caching. However, wrapping String templates inside a Velocity ResourceLoader gives us an important speed improvement in template rendering, varying from a 10x (cache off) to a 4x (cache on) factor. Interesting, and rather unexpected: even with plain simple Velocity.evaluate(), turning Velocity caching on decreases the merging time (probably context caching), meaning that a simple property setting could speed up the barcode generation, halving the time necessary for the merge. Sometimes it really pays off to read the documentation.

In conclusion: use a custom ResourceLoader and don't forget to enable caching for maximum Velocity performance. Well, this was nice but simple, right? Something crunchier, probably, in the next episode…



Written by Adrian

March 1st, 2004 at 5:10 pm

Posted in Tools
