Netuality

Taming the big bad websites

Archive for the ‘Java’ tag

Hallowed be thy tablename !

leave a comment

If you haven't had the opportunity to work on a really big project, naming is probably not on your top list of programming best practices. And you are certainly going to regret that when your project grows.

Of course, everybody, including good old Scott, knows that CUST signifies CUSTOMER and DEPT signifies DEPARTMENT. And statistically speaking, the chances for these abreviations to mean something else is very small – as long as your domain model is, also, quite small. But, when the number of classes in the domain is in the hundreds or even in the thousands you'll suddenly find out that CUST may signify CUSTOMS (as in 'Customs Tax'), CUSTOMIZATION or even CUSTARD. I am working right now in the development team of an ERP for agro-food industry and wouldn't be amazed to see such an attribute name. I've seen worst, some details of the implemented business model are a total blasphemy for human logic and common sense.

Anyway, the problem is even worse in these big projects because domain model classes are not written by hand, they are generated. While this is hardly a novelty for you (please don't laugh in the audience), it also means that analysts are composing the datamodel, then classes/mappings/SQL schema/docs are generated, finally programmers will write the business logic and infrastructure integration using the generated artifacts. Names are usually propagated all along the generation chain. And when a programmer finds 'Cust' in the name of an attribute, how does she know it's a 'Customer' and not some 'Custard' ? Especially when the documentation is scarce and the author analyst is in a well-deserved six-months sabbatical in Anctarctica.

Hence, the need for standardization. This is usually done via a dictionary containing the abbreviations and their meaning(s). The rule is very simple : every word in the datamodel must be composed of abbreviations from the dictionary. Some programmers might argue that there is no need for abbreviations and full words are ok – lovely code such as '.getSecondaryBillingAddressForService(currentBill.getBillableServicesList(i).get(currentService)).getStreet().getName()'. This is perfectly understandable, however let's not forget that some databases (Oracle, Sap DB, etc.) have issues with table and column names longer than 32 characters, like for instance refusing to create it in the first place. Which is mildly bothersome if you use a relational database*.

And the golden rules of domain model naming are :

  • Be a pedantic bastard. Don't just throw the dictionary in the wild and tell people 'yeah, pleease follow this standard'. Make automatic checking on every piece of datamodel feeding the code generator. The automatic checking should be done at each save operation if possible. I have implemented this inside an Eclipse plugin used by the project analysts: when hitting save on an entity containing invalid names, a window will immediately pop up and inform about the errors. Don't just display the errors, but completely forbid saving if the entity has naming issues. This will keep the naming absolutely pristine, however the analysts might be tempted to create a lynch mob. Do not give up.
  • Avoid synonyms, plurals, etc. This is a software product, not a grammar contest.
  • Throw some stats on the mail from time to time to tell how well the model is named. People will like that.

My current gig involves, among other interesting stuff, managing the naming tools in the various Java projects that we are developing. Unfortunately, the naming rules were not really enforced (they had no pedantic bastards before me ?), so the domain model is only partially compliant. Hence, I'm in the midst of developing tools for automatic renaming of model and the new code is going to disrupt the activity for a while (thank God for autocompletion features in modern IDE's !). Things would have been much smoother if the naming was enforced from the beginning. I think there is not such thing as 'too late' to put naming in order in a big project. And it'll absolutely be done, because there's very strong managerial support for this kind of tasks (main company shareholder and CEO is a former programmer himself, as well as a quality buff – 'when time permits'™).

Unfortunately, I had to allow some 'non-compliant' islands of code in the modules which are already deployed at customers. But, have no false hopes, sooner or later I'm gonna get that code too. I'm a pedantic bastard, and proud of it.

* Now, if you're using a wanabee storage solution like Prevayler to store gigabytes of business data (or more!), you have much bigger problems than naming. Please stop reading this article and do something about it.

Written by Adrian

June 26th, 2004 at 11:54 pm

Posted in Process

Tagged with , ,

… and a few lesser known Java tools

leave a comment

Very very busy lately, but I'd like to share some knowledge about a few useful Java OSS gems that were not easy to find. Mr. Google, please 'index this':

1. Aspirin is a self-contained SMTP server (send only) written in Java, open-sourced and free. It simplifies configuration and deployment by allowing your app server to send emails without passing through an external SMTP server. The project is heavily inspired from Apache James code (thus its licencing terms). The few problems I see right now are : possible performance issues when sending big volumes of mail, behavior still erratic (sometimes sending fails without plausible reason), failure reports which do not provide reasons of failure. However, the thingie works pretty well and is a big time saver because, well, configuration is not the most pleasant part of a complex server.

2. If you produce a lot of reports and want to send them automatically on a remote printing server you may use JIPSI (quickstart in English, but site in German) which implements CUPS as a Java Print Service API. This little beauty was found by one of my coworkers and the 'report guys' seem to be making good use of it.

3. You're in for some serious processing on OpenOffice documents using the freely available DTD's (downloadable from the OOo CVS server) ? Then hold your horses ! I've tried to make sensible use of them and failed abruptly. Let's just say that those DTDs are a big pain in the a**: to begin with, no tool is able to transform them into a schema. I've tried XmlSpy and a few other exotic softwares, without success. Even basic stuff like parsing with a validating parser does not work. So much for the usefulness of open standards. Eventually, I have ended up by using the excellent Writer2Latex. Don't be fooled by the name, you may do all sorts of conversions with it, including Writer to XHTML, which I was interested in. You can even write your own plugin to boot some exotic formats, because Writer2Latex is built around a modified version of XMerge. Officially, XMerge is the solution for visualizing documents on 'Small Devices', but it really is a fancy plugin-based document converter. Most probably (too lazy to check the sources), SAX-based with a nonvalidating parser. Go figure.

4. The Eclipse download site has now links to a BitTorrent tracker. I just used it succesfully in order to download RC3 at a reasonable download rate (anyway, being on wifi right now I wasn't expecting blazing speed). I found interesting that all the other peers were using Azureus, a torrent client written in Java+SWT. Azureus is a fantastic source of knowledge, choke full of tips and techniques for writing professional-looking and very responsive SWT apps. But not only : Azureus is also a great example about how to write a plugin-ready app, which performs automatic updates from the net. Not bad, at all.

Written by Adrian

June 21st, 2004 at 8:14 pm

Posted in Tools

Tagged with , , ,

(Almost) distributed CVS with Eclipse

leave a comment

NOTE : unfortunately, this trick is again unavailable starting with Eclipsev3M9, the last version which made that possible was v3M8



In Eclipse 3.0, in the CVS repository properties it is possible to separate the 'read' and 'write' access locations (see picture). AFAIK, this feature was meant for developers using extssh for CVS connection. They would use an anonymous account for updates in 'clear', and their respective accounts for commits over encrypted link. The basic idea is that clear connections are considerably faster than encrypted ones. However, you may change the whole connection string, not only the login and protocol, thus enabling the usage of two completely different repositories for 'read' and 'write' actions on CVS.

This might come handy in situations like the one we've recently been confronted with, in my team. The main CVS repository is located at a certain geographical distance (in the headquarters, in France) and the VPN bandwidth is nothing to brag about. Working with CVS was decent, but things have started to got meaner lately, mainly because of three reasons:

  • both teams have grown = more activity on the repository,
  • vast majority of developers is integrating frequently, aka every little bit of functionality is committed as soon as it's stable and working. Of course, before committing, there's the mandatory update to check for consistency against the most recent codebase. This means that at least a few synchronizations will be performed by each member of the team, each day.
  • Code generators. It's true that common sense dictates that nothing generated should be stored in CVS, because this may be source of frequent conflicts and loads the repository in an inefficient manner. However, when one routine generation (for each of the few modules) may take between 3 and 10 minutes and produces thousands of files, it becomes pretty obvious that it should not become part of the build process. What gives : in the days when the model has some important changes, a few different subsequent versions of 3-15MB jar files are committed on the repository. Update process starts to slow down and soon a part of the team is lagging, waiting to download the update. The irony is that usually everybody is downloading slowly and inefficiently the SAME big file. Of course, there are also other kinds of traffic via the VPN, such as Netmeeting (but you can't really have a conversation when everybody in the team is updating the codebase and slows the VPN to a crawl).

This boils down to having a 'read-only' local repository, perfect mirror of the main CVS server, which will be used only for updates. Both the server and its mirror are Linuxes. My choice for mirroring was good old rsync. While cvsup seems to be all the rage nowadays, I headed towards rsync because a) it comes pre-installed within any decent Linux distro and b) I am using Gentoo on my home desktop, so rsync is a tool that I learned to use [and abuse] almost daily, via portage.

We won't lose a lot of time explaining how to set up an rsync server, since it's very well explained in this rather old but useful tutorial. There's just a small twist : we were planning to synchronize frequently so we ran rsync daemon independently, not via xinetd. The config file on the server:

2;root@dev  /etc> more rsyncd.conf
motd file = /etc/rsyncd.motd
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock

[cvs]
path = /opt/cvsroot
comment = CVS Rsync Server
uid = nobody
gid = nobody
read only = yes
list = yes
hosts allow = 10.0.10.100/255.255.255.0

is a classical read-only, anonymous (we're on VPN, right ?) mirroring setup. Next step is making a rsync service in /etc/init.d or appending the line “/usr/bin/rsync –daemon” to /etc/rc.d/rc.local, to be sure that the daemon restarts after reboot [especially when you are NOT administrating the server].

On the client, there are some small tricks to make it work. First one, setup your client CVS repository in the same location as on the server, '/opt/cvsroot' in our case (because you are going to synchro CVSROOT as well, which contains absolute paths in some if its files).

The mirroring script is something in the spirit of:

#!/bin/bash

source /etc/profile
cd /opt/cvsroot

RUN=`ps x | grep rsync | grep -v grep | wc -l`
if [ "$RUN" -gt 0 ]; then
#already running
exit 1
fi

rsync -azv --stats 10.0.3.193::cvs/CVSROOT/history /home/cvs/history >/tmp/histo

sum1=`sum /home/cvs/history`
sum2=`sum /opt/cvsroot/CVSROOT/history`

if [ "$sum1" = "$sum2" ]; then
#nothing to do
date > /tmp/empty
exit 0
fi

date > /tmp/started
cat /dev/null > /tmp/ended
rsync -azv --stats --delete --force 10.0.3.193::cvs /opt/cvsroot > /tmp/status
cd /opt
date > /tmp/ended


This (badly written) script is heavily inspired from a script found on a Debian user mailing list:

  • exits in the case of a long running rsync process
  • assures that the full rsync is triggered by changes in CVSROOT/history. This is a neat trick which minimizes the server activity if you are syncing frequently, as we will do.
  • outputs stuff in some files in the /tmp directory. This has a double purpose. First is avoiding root mailbox pollution (because we're 'cron-ing' it, the output will be visible as a few hundreds of mails each day). Second is providing data for a web page on the client machine which tells at a glance what is the synchronization status. AFAIK rsync will not write incomplete files, but the usual sound advice is to make an update between successive synchronizations.

The small 'synchronization status' page (left as an exercise for the reader :) ) just prints the dates and the last few lines of the output files; this is a dumbed-down sample :

CVS synchronization started at Mon Feb 23 09:45:18 EET 2004
, ended at Mon Feb 23 09:46:00 EET 2004

Last empty synchronization recorded at Mon Feb 23 09:40:05 EET 2004

No knowledge about recent deleted files

Last synchronized
_/modules/postpreparation/tools/
_/modules/postpreparation/tools/velocity/
_/gateway/
_/src/mailer/
_/src/old/
_/src/tools/
_/src/tools/Attic/
_/tools/
CVSROOT/history
wrote 60767 bytes read 471522 bytes 13142.94 bytes/sec
Last added files
_/ihm/AlignementChamps.java,v

The only step left is to create a file in cron.d containing the line:

0-59/5 7-23 * * * cvs your_mirroring_script.sh

(meaning sychronization each 5 minutes from 07:00 to 23:00) and you may start enjoying blazingly fast CVS updates in Eclipse.

Written by Adrian

June 2nd, 2004 at 1:44 pm

Posted in Tools

Tagged with , , , ,

Comparing FOP and JasperReports

one comment

Anybody looking for OSS reporting solutions in Java usually has to make a choice between Apache FOP and Jasper Reports*. While having somewhat different feature sets and addressing distinct reporting solutions, the two APIs boil down to the same basic thing : generate a report from an XML file (or stream/string/whatever). FOP has a clear advantage of standardization (based on XSL-Formatting Objects) while Jasper plays more in the pragmatic field of obtaining those 80% results with a minimum of effort and uses a proprietary XML format.

But FOP is not a standalone reporting solution : it's just a way of transforming XSL-FO files into a report. In order to fill the report with the necessary data, the obvious choice is a templating engine such as Jakarta Velocity. Thus a FOP report creation is a two-step operation :

  • create the XML report via Velocity
  • feed the XML stream to FOP

Jasper alleviates this problem by including its own binding engine, the only restriction being that input data should support some constraints (such as putting your 'rows' inside a JRDataSource).

Both Jasper and FOP allow inclusion of graphic files inside, usual formats (GIF, JPEG) are supported, however FOP has a nice bonus of rendering SVG inside reports. Unfortunately, this comes with the price of using Batik SVG Toolkit, which is a bulky (close to 2MB) and rather slow API. While processing your dynamic charts as XML files (Velocity again) is a seducing idea, the abysmal performance of SVG rendering will make you give up in no time. Unfortunately, I speak from experience.

At first sight, FOP has a lot more options for output format, compared to Jasper Reports. Of course there's PDF and direct printing via AWT, but also Postscript, PCL, MIF as well as SVG. These choices are quite intriguing, since Postscript and PCL are printing formats (easily obtained by redirecting the specific printer queue into a file), MIF is a rather obscure Adobe format (for Framemaker) and SVG … well, a SVG report is too darn slow to be useable (yes, I was foolish enough to try this, too). Jasper makes again a pragmatic choice by allowing really useful output formats such as HTML, CSV and XSL (never underestimate the power of Excel); and of course: direct printing via AWT and PDF.
While FOP's latest version (0.20.5) was released almost a year ago (summer 2003), Jasper Reports is bubbling with activity – Teodor releases a minor version each one or two months (latest being 0.5.3 at 18.05.2004).

I've decided to use as a 'lab rat' one of the apps developed during my 'startup days': the client GUI is written in Swing and features a few mildly complex reports generated using Velocity+FOP. FOP version is 0.20.4 (the current version back in Q1-2003, when we had to quit dreaming about the 'next financing round' and development halted) but as I already told you FOP has evolved little since then. Though, it's perfectly reasonable to use this implementation as a witness for comparison with Jasper (on the opposite, Jasper has evolved a great deal since Q1-2003).

Back then, the report development cycle was quite simplistic. In fact, the XSL-FO templates were written by hand inside a text editor and the application code was run (via a Junit testcase and some necessary configuration and business data mocking) in order to generate a PDF report. In the case of errors, we had feedback by examining the error traces. Visual feedback was given by the PDF output. While simple to perform, this cycle was extremely tiresome after a while as there was an important overhead : start a new JVM, initialize FOP, fire Acrobat Reader (plus we were using some crappy – even by the standards of 2003 – 1GHz machines w 256/512MB RAM). A WYSIWYG editor would have been nice, so one of my coworkers has made some research and the only solution he found was XMLSpy (Stylevision not available back then) – but, at 800USD/seat this was 'a bit' pricey** for us (only the Enterprise flavor covers FO WYSIWYG editing !?). Another interesting idea was to use one of the conversion tools (from RTF to FOP) such as Jfor, XMLMind or rtf2fo (of these products, only Jfor is free, but feature-poor). What stopped us from doing it was that the generated FO was overly complex : we needed comprehensible cut_the_crap files because we were going to integrate inside Velocity templates. And when you have tens of tags and blocks inside blocks and not the slightest idea which one is a row, which one is a column and which one is a transparent dumbass artefact, it's a gruesome trial-and-error task to integrate even simple VTL loops. And you'd have to do this each time you change something in the report : yikes ! Conclusion : the report development cycle was primitive for FOP and there was no way we could change it.

Things are quite different for Jasper Report : there are a lot of available report designers, and some of them are free. While the complete list is on Jasper Report site, I'd like to note at least three of them :

  • iReport is a Swing editor and very interesting because it's not only covering the basic Jasper functionality but also supplementary features such as barcode support (which is admittedly as easy as embedding a barcode font in Jasper with two lines of XML, but much easier to make it via a mouse click). iReport is free, which is excellent, but is a standalone app without IDE integration, and as any complex Swing app is quite slow and a memory hog.
  • if you are a developer using Eclipse, you'd appreciate two graphical editors based on Eclipse GEF, available as Eclipse plugins : JasperAssistant and SunshineReports. None of them is free and, at least on paper, the functionality seem identical, but SunshineReports has only the older 1.1 version downloadable, which is free but does NOT work with recent builds of Eclipse 3. How the heck am I supposed to test it ? On the contrary, Assistant has a much more relaxed attitude allowing the download of a free trial for the latest version of their product. Maybe too relaxed, though, because – even if (theoretically) limited in number of usages – you can use the trial as much as you want to***. But if you are serious about doing Jasper in Eclipse you should probably buy Assistant, available for a rather decent 59USD price tag. I am currently using it and it's a good tool.

So much for the tools, let's get the job done. The bad part : if you're experienced with FO templates, don't expect to be immediately proficient with Jasper, even with a GUI editor. The structure of an FO document has powerful analogies with HTML : you have tables, rows, cells, stuff like that, inside special constructs called blocks. It's relatively easy to use a language such as VTL in order to create nested tables, alternating colors and other data layout tricks. You can even render a tree-organized data via a recursive VTL macro, and everything is smooth and easy to understand. Jasper is completely different and at first sight you'll be shocked by its apparent lack of functionality : only rectangles, lines, ellipses, images, boilerplate text and fields (variable text). Each one of this elements has an extensive set of properties about when the element should be displayed, stretch type, associated expression for value and so on. Basically, you'd have to write Java code instead of Velocity macros and call this code from the corresponding properties of various report elements. If at the beginning it feels a little awkward, after a while it comes quite natural and simple. As for nesting and other advanced layouts, there is a powerful concept of 'subreport'. And yes I've managed to render a tree using a recursive subreport, but given the poor performance the final choice was to flatten the data into a vector then feed it into a simple Jasper report. So pay attention to the depth of 'subreporting'.

Once the reports were completely migrated, I've benchmarked a simple one (without SVG, charts, barcodes or other 'exotic' characteristics). The test machine is a 2.4GHz P4 w 512MB Toshiba Satellite laptop. In the case of FOP, the compiled velocity template and the FOP Driver are cached between successive runs. In the case of Jasper, the report is precompiled and loaded only on first run, then refilled with new data before each generation. The lazy loading and caching of reporting engines is the cause of important time differences between the generation of the first report and the subsequent reports. Delta memory is measured after garbage collection. The values presented are median for 10 runs of the 'benchmark report'.

  First run Subsequent runs Delta memory
Velocity + FOP 10365ms 381ms 850KB
Jasper Reports 1322ms 82ms 1012KB

While I am totally pro-Jasper after this short experiment, it is important to note that commercial and well-maintained FO rendering engines such as RenderX XEP claim improved performance upon FOP. Depending on your requirements, environment and reporting legacy apps, an FO-based solution might be better, especially when report generation is only on server-side.

Of course, usual disclaimer apply: this benchmark is valid only for my specific report in my specific application so YMMV.

* While I am aware that other OSS solutions do exist for Java, I consider these two as 'mainstream'.

** Did I mentioned that we were a startup with financing problems ?

*** No, I'm not going to explain here how it can be done.

Written by Adrian

May 25th, 2004 at 8:57 am

Posted in Tools

Tagged with , , , ,

Re: Where are all the SWT applications?

leave a comment

Mr. Jesus M. Rodriguez asked fellow Java bloggers where are all those SWT apps. Here's a possible answer.

First, do not forget that Eclipse is a big ecosystem. There are a bunch of software development shops making a living from selling Eclipse plugins : Omondo, Genuitec, Instantiations, Exadel, etc. Add here the vast majority of the older/bigger tool producers (such as Borland/Together, Visual Paradigm …) which ported their products as Eclipse plugins. Count also important product suites from (of course) IBM (WSDD, Rational, etc.), (for sure) Lotus and (surprise ?) Novell and SAP. There is life in here and a great deal of it is due to SWT. There are a few strange Swingonian hybrids, but most of the plugins and products are purely SWT-based.

SWT is still young, so unfortunately there are only a few interesting opensourced projects based on SWT. Among them there's Azureus, a BitTorrent client which is consistently in the top 5 downloads at SourceForge. I have noticed an interesting fact : Azureus is THE ONLY desktop Java app (which is not an IDE or a development tool) and which I consistently find installed on different systems (friends, coworkers). Add to the list Sancho, a P2P multiprotocol client, SQLAdmin another rdb administration tool and RSSOwl plain simple RSS feedreader. These are all decent performing apps and their sources are ripe for grabbing and inspiration.

But this is just the tip of the iceberg. The Eclipse newsgroups are full of people discussing RCP development strategies and asking SWT questions. And – you know – they're not all unemployed ex-dotcommers with Yahoo emails. But let's not make speculations about people email addresses – let's just look at an interesting existing product : GDFSuite is a whole suite/platform targeted at geobusiness applications. In their presentation from EclipseCon 2004, Frank Gerhardt (SENS) and Chris Wege (DaimlerChrysler AG) shared a few development experiences related with GDFSuite rich client, built upon the Eclipse platform. I bet there are a few other important RCP-based products which will be announced during 2004.

Last, but not least, I have some personal experience to share as well. My current employer sells a production supervision tool targeted at food industry, with a SWT-based rich client. When I say “sells” I mean “already sold quite a few licenses and currently successfully installing at big customers”. The SWT-client is deployed in its Motif version on P3/500Mhz machines running Linux. It's fast and slick. It's not very aesthetically pleasing and you can not play Solitaire on it, but this doesn't seem to interest the guys cutting meat or the girls which are inspecting the barcodes on the package of your next 1.99$ steak. Some other SWT-based products are undergoing development in our Java department, but of course they are really really really secret, they will be announced as soon as they are ready and they'll kick some serious ass ;) There are probably quite a lot of other companies capitalizing on SWT to make fast (or 'rich') clients for their commercial products.

Swing is beautiful. Swing is conceived with MVC in mind. Swing has Jgoodies. Swing is also slowly-moving and memory-bulimic. When all I have to do is a simple interface with a text displayed in big fonts and a “Print barcode” button, does it really have to pass through 10 nested layers of listeners ? What shall I do with Swing on a CPU-handicapped computer in production environment, where 200ms is considered a big delay ? But of course, if you want intellectual challenges and nicely architectured clients, go for Swing. If all you need is to be efficient and please the customer, take SWT for a ride.

I know, I know – I may sound like the VB advocates a few years ago. But it's not the point. It's a matter of doing the right compromise between a rich functional API and a simple fast API, between generalization and the need for custom coding. My opinion is that Swing has – for the moment – failed to reach the good compromise. SWT it's just the next candidate.

PS Jess is a dual-licensed inference engine for Java. The next version (7 aka 'Charlemagne') includes an Eclipse-based IDE. Unfortunately, not being a licensed Jess user I do not have access to latest beta versions. Maybe somebody could shed some light upon this question.

PPS There's some more stuff at SWT Community Page

PPPS A bit of history : Swing (JFC) was made public in april 1997. SWT became available to developers, along with the Eclipse donation, in november 2001.

Written by Adrian

April 4th, 2004 at 9:52 am

Posted in Tools

Tagged with , , ,

I’m amazed : I own a PC !

leave a comment

Two interesting predictions:

“McNealy also predicted that five years from now, most people will be amazed they ever owned a PC.”
June 5, 1998

“This agreement will be of significant benefit to both Sun and Microsoft customers. It will stimulate new products, delivering great new choices for customers who want to combine server products from multiple vendors and achieve seamless computing in a heterogeneous computing environment.”
Mc Nealy, President and CEO Sun Microsystems, April 2, 2004

Written by Adrian

April 3rd, 2004 at 4:32 pm

Posted in AndEverythingElse

Tagged with , ,

Getting rid of the dreaded o.e.c.i.r.AssertionFailedException

leave a comment

It you have ever written even the smallest standalone app using SWT and Jface library, then you must know (and hate!) org.eclipse.core.internal.runtime.AssertionFailedException. This exception has the very bad habit of substituting the initial one without keeping any trace of it. The official explanation is the following :

“Jface is not guaranteed to work in standalone mode. As a matter of fact, Jface tries to report different errors using the Eclipse plugin mechanism, thus provoking a fatal assertion failure which is the only exception being reported.”

Ryan Lowe suggests wrapping [the code] in a try/catch block temporarily. Yeah sure this might just work if you know WHAT to wrap. What do you do when, after multiple commits and updates of the same source file modified by different people, you have the failed assertion ? Wrap in a try/catch every slice of code committed in the last few hours ? Hmmm … don't think so.

An interesting solution would be (as proposed by Ryan) to use aspects in order to wrap automatically each and every SWT call. But this might bring serious performance issues and I also have some doubts that it will catch everything.

But boy, aren't we lucky that Eclipse is opensourced ? Just by looking at the stack trace (which is the same over and over no matter where your real error is) you can spot the guilty : org.eclipse.core.internal.runtime.InternalPlatform (found in runtime.jar/runtimesrc.zip). The little bugger is Mr. private static boolean initialized. Putting it on true involves starting the platform, which is a quite complex process judging by the loaderStartup(URL[] pluginPath, String locationString, Properties bootOptions, String[] args, Runnable handler) throws CoreException method. Unfortunately, being a private variable means that reflection won't work and dynamic proxying won't work, either*. AKA Out Of Ideas (TM)

So, it's time to stop trying to be smart. I'll just be a rude dumb programmer and simply modify the source, recompile and replace the .class files in the runtime.jar. It's extremely simple yet sharply efficient:

  • initialize “initialized” on true
  • in the method private static void handleException(ISafeRunnable code, Throwable e) replace all the content with the following line : e.printStackTrace();code.handleException(e);. A stack in the console is enough for me at this stage, but of course you may use your favorite logging infrastructure classes to report the exception. The code.handleException call has (at least in my app) the effect of displaying an innocuous dialog box telling the user that something went wrong. Just the right dose of details aka nothing at all (don't scare the poor user, please).

Well, what's more to say ? This stupid trick simply works.

On second thought, I might use RCP. Maybe, in a future episode.

*AFAIK Please correct me if I'm mistaking here.

PS That was quick “Setting the accessible flag in a reflected object of java.lang.reflect.AccessibleObject class allows reading/modifying a private variable that normally wouldn't have permission to. However, the access check can only be supressed if approved by installed SecurityManager.”. Humm, ok, so there IS a smarter solution after all. Left as an exercise for the reader :)

Written by Adrian

March 25th, 2004 at 8:23 pm

Posted in AndEverythingElse

Tagged with ,

Book review – Tapestry In Action

leave a comment

My first contact with Tapestry was more than 18 months ago. Back then, I was interested to find a web framework for integration with our custom Avalon-based (using the now-obsoleted) Phoenix server*. The web interface was ment to be backoffice stuff, for simple administration tasks as well as statistics reports. Given that the data access and bussines logic were already developed, we were looking for something simple to plug into a no-frills servlet container such as Jetty. we managed very easily to integrate Jetty as a Phoenix service and pass data through the engine context. But when we finally integrated Tapestry [into Jetty [inside Phoenix]] and make it display some aggregated statistics, the project funding was cut and the startup went south. But, that’s another story and rather uninteresting one.

Meanwhile, things have changed a bit. Tapestry had become a firsthand Apache Jakarta project, the Tapestry users list is more and more crowded, and again I see it used in my day work (by Teodor Danciu, one of my coworkers and incidentally author of Jasper Reports) and doing some moonlighting by myself for an older web project idea. And there is exceptional Eclipse support via Spindle plugin. While the ‘buzzword impact’ on Tapestry on a Java developer CV doesn’t yet measure up with Struts, this framework has obviously gained a lot of attention lately.

So, what’s so special about it ? If I’d have to choose only one small phrase I’d quote Howard Lewis Ship, Tapestry lead developer, from the preface of his book ‘Tapestry in Action’:

The central goal of Tapestry is to make the easiest choice the correct choice.

In my opinion this is the weight conceptual center of the framework. Everything, from the template system which has only the bare minimum scripting power, passing through the componentized model, up to the precise detailed error-reporting (quite unique feature in the opensource frameworks world) gently pushes you (the developer) to Do The Right Thing. To: put logic where it belongs (classes not templates), organize repetitive code in components, ignore the HTTP plumbing and use a real, consistent, MVC model in your apps (forms are readable and writable components of your pages). You don’t need to be Harry Tuttle to make a good Tapestry webapp, just a decent Java developer is enough. That’s more than I can tell about Struts …

Coming from a classic JSP-based webapp world, Tapestry is really a culture shock. The most appropriate way to visualise the difference is to imagine a pure C programmer abruptly passing to C++, into the objects world**. For a while, he will try to emulate the ‘old’ way of work, but soon enough he’ll give up and start coding his own classes. However, this C programmer will have to make some serious efforts, not necessarily because OOP is hard to leard, but in order to break his/her old habits.

“Tapestry in Action” is your exit route from the ugly world of HTTP stateless pages and spaghetti HTML intertwingled with Java code and various macros. It’ one of the best JSP detoxification pills available on the market right now.

The first part of the book (‘Using basic Tapestry components’) is nothing to brag about. It’s basically an updated and nicely organized version of the various tutorials already available via the Tapestry site, excepting probably some sections in chapter 5 (‘Form input validation’). By the way, the chapter 5 is freely downloadable on the Manning site and is a perfect read if you want a glimpse of the fundamental differences between Tapestry and a classic web framework (form validation being an essential part of any dynamic site). However, if you want to go over the ‘Hangman’*** phase you really need to dig into the next two book sections.

The second section ‘Creating Tapestry components’ is less covered by the documentation and tutorials. I’m specifically pointing here to the subsections ‘Tapestry under the hood’ (juicy details about pages and components lifecycle) and ‘Advanced techniques’ (there’s even an ‘Integrating with JSP’ chapter !). While it is true that any point from this chapter will generally be revealed by a search on Tapestry user list or (if you’re patient) by a kind soul answering your question on this same list, it’s nethertheless a good thing to have all the answers nicely organized on the corner of your desk.

The third and last chapter (‘Building complete Tapestry application’) is a complete novelty for Tapestry fans. It’s basically a thorough description of how to build a web application (a ‘virtual library’) from scratch using Tapestry. While the Jboss-EJB combination chosen by the author is not exactly my cup of tea (I’m rather into the Jetty+Picocontainer+Hibernate stuff) I can understand the strong appeal that it is suppposed to have among the J2EE jocks. Anyway, given the componentized nature of Tapestry, I should be able to migrate it relatively easily if I feel the need for it. The example app is contained in a hefty 1Meg downloadable archive of sources, build and deployment scripts included.

To conclude, ‘Tapestry in Action’ is a great book about how to change the way you are developing web applications. The steep learning courve is a little price to pay for a two or three-fold improvement in overall productivity. And this book should get you started really quick.

*Which AFAIK is still used at our former customer.
**There were some posts on Tapestry user list on about a certain conceptual ressemblance with Apple’s WebObjects. I can’t really pronounce upon this because I do not know WebObjects, but the name in itself is an interesting clue.
***’Hangman’ is the Tapestry ‘Petshop’ (although there is also a ‘real’ Tapestry Petshop referenced in the Wiki).

Written by Adrian

March 18th, 2004 at 2:56 pm

Posted in Books

Tagged with , , , , ,

Eclipse plugins and Groovy : when binary compatibility is not enough

leave a comment

One of my current responsibilities is to maintain an internally developed plugin, used by various members of the team to generate code from the analysis model. As far as I can tell by the webstats of the update site, every version is downloaded by 18 people, a small but heterogeneous user base.

My biggest problem is the Eclipse version. The analysts are not exactly Java geeks waiting anxiously for nightly builds of Eclipse, they use a 'standard' 2.1.2, mainly because it's stable and well internationalized. Things go wilder in the programmers team : versions ranging from conservative (2.1.x) to liberal (3.0M4) and even the occasional dumbass with the latest integration build (that would be me, of course).

The 'enhanced binary compatibility' in 3.0M7 came as a relief, diminishing the need to switch between Eclipse versions in order to develop the plugin or work on other tasks. Well, I still have to briefly test the damn thing on Eclipse 2.1.x before releases. However, running simultaneously two or three Eclipse instances is no piece of cake for my 512Mb laptop (I still haven't found who I have to kill here in order to be awarded a memory upgrade). Unfortunately, checking out the plugin source into M7 has shown the invisible ugly face of 'binary compatibility': the plugin doesn't compile.

There are just a handful of lines of code, some emphasizing differences in Eclipse API which are somehow hidden in 'compatibility mode', some effectively showing small bugs in plugin behavior. But the real issue here is that I cannot really develop the plugin in M7 until I manage somehow to compile it, while not losing downwards compatibility.

Let's dissect one of the compilation issues. The bummer concerns automatic opening of an editor (or focus if already opened) when clicking on its reference (somehow similar to what happens when you Ctrl+click on a class name in JDT). In the older API it was a question of page.openEditor(file); where page is a IWorkbenchPage and file is an IFile. This simple stuff worked well until 3.0M4, then (M5) things changed to page.openEditor(new org.eclipse.ui.part.FileEditorInput(file),editorId); where FileEditorInput implements (among others) an IEditorInput. While this is certainly nice because you may directly link editors to something other than files***, obviously the old code does not compile under M7.

Maintaining different projects for 'old' and 'new' style projects for 10 or so lines of code is obviously overkill. Second solution – via reflection, but it would mean more than few lines of code and the result would not exactly be comprehensible nor maintainable. Only thing left : use a scripting language.

Of course I could have taken any decent scripting language embedded in Java. Decision to go with Groovy was taken mainly because of its coolness factor, but I am sure the idea will apply easily with Jython (big favorite of mine) or the performance-aware Pnuts, for instance.

In a nutshell, you have to execute a line of code depending of the current Eclipse version (it's a little bit trickier, but we'll discuss later about it).

groovy.lang.Binding binding = new Binding();
binding.setVariable("page", page);
binding.setVariable("file", file);
groovy.lang.GroovyShell groovyShell = new GroovyShell(getClass().getClassLoader(), binding);
if (newPlatform)
{
return groovyShell.evaluate("page.openEditor(new org.eclipse.ui.part.FileEditorInput(file),editorId);", someExpressionId);
}
else
{
return groovyShell.evaluate("page.openEditor(file);", someExpressionId);
}

It's basically a vanilla flavored ripoff of the Groovy embedding example from the docs. The boring part : caching the binding, hiding everything behind a nice facade, is left as an exercise for the [interested] reader. Remember to pass the classLoader of the current class, do not create a GroovyClassLoader out of nowhere or you'll end up dealing with Eclipse own class loader, which means trouble for simple tasks like these.

How do we know that the Eclipse version is the 'new' or the 'old' one is not that simple because remember : 'old' means 2.x up to 3.0M4. So finding out Eclipse SDK version is not enough, you have to find out another discriminant which in our case is the 'org.eclipse.ui.ide' plugin. Result:

boolean newPlatform;
//find out if we are inside a new or an old platform
PluginVersionIdentifier pvi = Platform.getPluginRegistry().getPluginDescriptor("org.eclipse.platform").getVersionIdentifier();
newPlatform = pvi.getMajorComponent() >= 3 && Platform.getPluginRegistry().getPluginDescriptor("org.eclipse.ui.ide") != null;

No, we are not ready to deploy yet. A small trick has to be performed or the plugin won't start under older versions of Eclipse. We had to add some new plugins to dependencies (in the pligin descriptor) such as the aforementioned 'org.eclipse.ui.ide', obviously the older versions of Eclipse will not find it, hence block our plugin activation on startup. In order to overcome this, you have to add (by hand !) a lesser known attribute ('optional') in the corresponding tag from the plugin.xml file : <import plugin=”org.eclipse.ui.ide”/> becomes <import plugin=”org.eclipse.ui.ide” optional=”true”/>. Now, the plugin is ready to be deployed.

For those brave enough to dare distributing such a plugin via an update site remember to 'cheat' by not allowing new plugins such as 'org.eclipse.ui.ide' in the feature.xml file (again, delete by hand). The 'optional' attribute doesn't help in this case. Go figure…

I hope that some of you will find useful this recipe for maintaining compatibility between different Eclipse versions with minimum of fuss. However, please note the specific prerequisites for this type of solution :

  • there are only simple 'few-lines' modifications
  • the code is not expected to evolve a lot in the 'affected' areas
  • the evaluated code is not in a performance-sensitive area

***Interesting enough, this was one of the reasons I recommended against adopting RCP in one of our apps, a few weeks ago. It's nice to see that – now – the mechanism linking editors and resources is MUCH more flexible. Anyway, this won't probably change the decision of not using RCP because the main issue it's the volume of code we have to change. Development of one of the app modules started almost a year ago and the animal it's already sold and deployed on different production sites: upgrading would be a real nightmare. Maintaining a fork of the app is not an option either. Well, I guess we'll just have to cope with 'plain old' Jface and SWT.

PS After some days of 'silence', I have noticed from the logs that most popular posts on my blog are those concerning Eclipse plugins and Manning books (I seem to have a nice Google ranking on these topics). So, expect more of these (I am reading the MEAP of 'Tapestry in action' – a review should be up shortly).

Written by Adrian

March 1st, 2004 at 6:36 pm

Posted in Tools

Tagged with , , , ,

Using jython to internationalize a PHP app

leave a comment

At first, this might seem a mind-boggling combination. What do
jython and PHP have in common (excepting the fact that I am a Python fan
and my current consulting task is in a PHP project) ?
Well, internationalizing a PHP app is pretty much a trivial task.
If you are a sensible PHP programmer insisting to use PEAR instead of randomly choosing a script from the tons of snippets
populating the “scripting websites”, I18N is probably the
safest choice.
Maybe – for you – application maintainability and performance are not exactly important concerns.
For me, they are. This is why I chose to store internationalized texts in files rather than database.
I'd rather keep the database for real data, which is created, modified, aggregated and such.
And I'd rather like to have an internationalized error message on the screen even if the database is down.
Now we know that we'll use I18N and text will be kept in some php files. However, I am no professional translator and
have no desire to translate or to manually maintain the correspondence between translators files and PHP files
(no, translators won't modify PHP code, stop this nonsense right away).
Code generation comes immediately in mind.
Basically, my first idea was to investigate wether the files used by the translators can be quickly transformed to PHP,
and if I am able to generate their formats from my own files (aka. “roundtrip internationalization process” ?).
Unfortunately, this is not an easy task – as the only clue was that the translators use Office tools such as Word or Excel, because they
rely upon some specialized translation software integrated with these products.
The easy choice is Excel, since it allows a better organization of data than having to search for tables in a Word document.
The hard choice is the tool that I'd use for automatically reading and even generating Excel files.

The difficulty comes from the fact I don't have Windows with Office installed on my desktop, just Gentoo Linux and OpenOffice.
Thus, I am unable to write a simple Python script which could perform my generation tasks via automation.
Fortunately, this is not the first time I am confronted with the issue.
I happen to know that there is a very nice Java tool that I wholeheartedly
recommend for your Excel processing needs :
JExcelApi.

Still, Java is a heavyweight programming language – it would be a really bad idea to fire up the

monster just for some easy processing of Excel files.
Here's why Jython comes naturally into equation. Four hours and about
100 lines of debugged code later, here I am sitting on top of a perfectly functional internationalization tool which :

  • generates PHP code from a big xls file (the root vocabulary) which centralizes all the internationalization texts
  • generates 2-language xls files for translators usage
  • updates the root vocabulary starting from the files modified by the translators
  • Automation scripts are already in cron and there's also a nice text document explaining translators where to get
    their files and where to put them after modification. The resulting script is not exactly fast, but this is tooling
    and not production so this should not be a problem after all.

    Whatever your project contraints are, give Jython a try and you'll be amazed … As they put it on the
    Useless Python site – If it were any simpler, it would be illegal.
    Finally there's a trick not quite related with Jyhon, nevertheless interesting.
    There is an easy way of solving the problem of translating phrases with real data inside them, with easy parameter swapping.
    We'll use the good old sprintf but not directly. We'll pass through a not so popular but extremely useful function,
    call_user_func_array. Suppose that our example needs the
    user name and authorization profile description to display inside a nice message. All you have to do is to define placeholders
    in I18N files which would fit as the first argument for sprintf. The following example should make it clearer:

    localization/en/login.php
    $messages = array(
    'loggedin'=>'You are authenticated successfully as user %1$s with profile %2$s.'
    );
    $this->set($messages);
    
    localization/fr/login.php
    $messages = array(
    'loggedin'=>'Vous avez le profile %2$s en tant qu'utilisateur %1$s.'
    );
    $this->set($messages);
    
    Simple passing of multiple parameters to I18N in PHP. Example function without error processing or data domain checking.
    #this is the multiple parameter function
    function complexTranslation($i18n, $label, $params)
    {
      return call_user_func_array('sprintf',array_merge(array($i18n->_($label)),$params));
    }
    
    Then, you have to initialize your I18N object. This can be done in a generic manner for all pages.
    #specific I18N initialization stuff
    require_once 'I18N/Messages/File.php';
    $g_language_dir = dirname($_SERVER['PATH_TRANSLATED']).'/localization/';
    $i18n =& new I18N_Messages_File($g_langCode,$script_name,$g_language_dir);
    
    Finally, use the function.
    #translate the successfull login message
    $loginbox = Tools::complexTranslation($i18n,'loggedin',array($operator->name,$profile->description));
    

    Written by Adrian

    March 1st, 2004 at 5:20 pm

    Posted in Tools

    Tagged with , , , , , ,