Netuality

Taming the big, bad, nasty websites

Archive for the ‘library’ tag

XML descriptors in PHP frameworks – considered harmful

No, I am not a seasoned PHP programmer and I do not intend to become one. But we do live in a harsh economy where all IT projects are worth considering, hence my occasional incursions into the world of PHP-driven websites.

I am not new to PHP either, but – coming from a Java world – I immediately felt the need for a serious MVC framework.
Nobody wants to reinvent the wheel each time a new website is built. Just run the obvious “PHP MVC framework” search on Google and the results pages will be dominated by four open-source projects :

  • PHPMVC is probably the oldest project and implements a model 2 front controller
  • Ambivalence declares itself a simple port of the Java Maverick project
  • Eocene is a “simple and easy to use OO web development framework for PHP and ASP.NET”, implementing MVC and front controller
  • Phrame is a Texas Tech University project released as LGPL, heavily inspired by Struts.

The choice is not easy. There are no examples of industrial-quality sites built with any of these frameworks.
(some may say there are no examples of industrial quality sites built with PHP but let's ignore these nasty people for now).

There are no serious comparisons of the four frameworks, neither feature-wise nor performance-wise.
In the tradition of open-source projects, the documentation is rather scarce and examples are “helloworld”-isms.
Yes, I am a bloody bastard for pointing out these aspects – since the authors are not paid to release these projects – and perhaps I could contribute some documentation myself. However, when under an aggressive schedule I feel it's easier to write my own framework than to understand other people's code and document it thoroughly.
Still, I have a nice hint for you. The first three frameworks use XML files for controller initialization (call it “sitemap”, “descriptor” or otherwise; it's just a plain dumb XML file). So you can safely ignore them in a production environment.

Why? Because the “controller” is nothing more than a glorified state machine. The succession of states and transitions (or “actions” or whatever) has to be persisted somewhere. XML is probably a nice choice for Java frameworks, where the files are parsed once and the application server keeps a pool of controller objects in memory.

But PHP is stateless from one request to the next. The only way of keeping state is via the filesystem or a database, usually based on an ad-hoc generated unique key which is kept in the session cookie. Moreover, PHP allows native serialization only for primitive variables; a complex object such as the controller cannot be persisted easily, so it has to be retrieved from XML and fully rebuilt. Unlike in Java appservers, objects cannot be shared between multiple sessions, so pooling is not an option. In PHP, then, the XML approach is strongly discouraged, since it means that the XML files are parsed for each page viewed on the site. Although PHP's parser is James Clark's Expat, one of the fastest parsers right now (written in C), note that the parsed document must still be walked in order to create the controller object (which becomes more and more complex as the site grows). This is called heavy overhead, no matter how you look at it.

There are a few reasons why you might want XML in a web framework; however, none of them applies to PHP apps. Myth quicklist:

  • it's “human-readable”. Come on, PHP is stored in ASCII readable files, and even if you use Zend products to compile and encrypt your code, why on earth would you allow readability and modification of the controller state machine on the deployment site ?
  • easier to modify than in code. This is probably true for Java and complex frameworks, but PHP is significantly simpler than Java.
  • automatically generated from code by tools such as Xdoclet or via an IDE. Only if you're writing in Java; PHP does not have such tools.

This means that the only serious candidate (among those considered here) for a PHP MVC framework is Phrame, which stores the sitemap as a multi-dimensional hashmap. Thus, you should either consider Phrame or, for small sites (< 50 screens), you'll be better off writing your own mini-framework, with a state machine implemented as a hashed array of arrays and some session persistence in the database. I chose to serialize and persist an array containing primitive variables, using PHPSESSID as the primary key in order to retrieve and unserialize the array, all coupled with a simple "aging" mechanism for those users with the nasty habit of leaving the site without logging out first.
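To make the “hashed array of arrays” idea concrete, here is a minimal sketch of such a transition table. It is written in Java rather than PHP (in PHP the same shape would be a nested associative array), the state and action names are invented for illustration, and it is not Phrame's API.

```java
import java.util.HashMap;
import java.util.Map;

class SiteFlow {
    // state -> (action -> next state); every name below is hypothetical.
    private static final Map<String, Map<String, String>> TRANSITIONS = new HashMap<>();

    static {
        addTransition("login", "submit", "home");
        addTransition("home", "edit", "profile");
        addTransition("profile", "save", "home");
        addTransition("home", "logout", "login");
    }

    private static void addTransition(String from, String action, String to) {
        TRANSITIONS.computeIfAbsent(from, key -> new HashMap<>()).put(action, to);
    }

    // Returns the next state, or the current one when the action is unknown.
    static String next(String state, String action) {
        Map<String, String> actions = TRANSITIONS.get(state);
        if (actions == null) {
            return state;
        }
        return actions.getOrDefault(action, state);
    }

    public static void main(String[] args) {
        String state = "login";
        for (String action : new String[] {"submit", "edit", "save", "logout"}) {
            state = next(state, action);
            System.out.println(action + " -> " + state);
        }
    }
}
```

Only the current state name (a plain string) needs to be stored per session; the table itself is code, so nothing has to be parsed on each request.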

Finally, a last word of advice : use PEAR ! This often overlooked library of PHP classes includes a few man-years of quality work. You'll get a generic database connection layer (PEAR-DB) along with automatic generation of model classes mapped on the database schema (DB_DataObject), a plethora of HTML tools (tables, forms, menus) and some templating systems to choose from. All in a nice, easy to install and upgrade package.

Don't put a heavy burden on your upgrade cycle by using heterogeneous packages downloaded from different sites on the web, just use PEAR.

Or simply ignore the PHP offer and wait patiently for your next Java project. Vacations are almost over.

Written by Adrian

October 29th, 2004 at 8:53 am

Posted in Tools

Getting rid of the dreaded o.e.c.i.r.AssertionFailedException

If you have ever written even the smallest standalone app using the SWT and JFace libraries, then you must know (and hate!) org.eclipse.core.internal.runtime.AssertionFailedException. This exception has the very bad habit of replacing the original one without keeping any trace of it. The official explanation is the following :

“Jface is not guaranteed to work in standalone mode. As a matter of fact, Jface tries to report different errors using the Eclipse plugin mechanism, thus provoking a fatal assertion failure which is the only exception being reported.”

Ryan Lowe suggests wrapping [the code] in a try/catch block temporarily. Yeah sure this might just work if you know WHAT to wrap. What do you do when, after multiple commits and updates of the same source file modified by different people, you have the failed assertion ? Wrap in a try/catch every slice of code committed in the last few hours ? Hmmm … don't think so.

An interesting solution would be (as proposed by Ryan) to use aspects in order to wrap automatically each and every SWT call. But this might bring serious performance issues and I also have some doubts that it will catch everything.

But boy, aren't we lucky that Eclipse is open source ? Just by looking at the stack trace (which is the same over and over, no matter where your real error is) you can spot the guilty party : org.eclipse.core.internal.runtime.InternalPlatform (found in runtime.jar/runtimesrc.zip). The little bugger is Mr. private static boolean initialized. Setting it to true involves starting the platform, which is quite a complex process judging by the loaderStartup(URL[] pluginPath, String locationString, Properties bootOptions, String[] args, Runnable handler) throws CoreException method. Unfortunately, being a private variable means that reflection won't work and dynamic proxying won't work either*. AKA Out Of Ideas (TM)

So, it's time to stop trying to be smart. I'll just be a rude dumb programmer and simply modify the source, recompile and replace the .class files in the runtime.jar. It's extremely simple yet sharply efficient:

  • initialize “initialized” to true
  • in the method private static void handleException(ISafeRunnable code, Throwable e) replace all the content with the following line : e.printStackTrace(); code.handleException(e);. A stack trace in the console is enough for me at this stage, but of course you may use your favorite logging infrastructure classes to report the exception. The code.handleException call has (at least in my app) the effect of displaying an innocuous dialog box telling the user that something went wrong. Just the right dose of details, aka nothing at all (don't scare the poor user, please).

Well, what's more to say ? This stupid trick simply works.

On second thought, I might use RCP. Maybe, in a future episode.

*AFAIK. Please correct me if I'm mistaken here.

PS That was quick : “Setting the accessible flag in a reflected object of java.lang.reflect.AccessibleObject class allows reading/modifying a private variable that normally wouldn't have permission to. However, the access check can only be suppressed if approved by installed SecurityManager.”. Hmm, ok, so there IS a smarter solution after all. Left as an exercise for the reader :)
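For the impatient reader, a minimal sketch of that exercise follows. FakePlatform is a hypothetical stand-in for InternalPlatform (only the private static boolean matters here), so this shows the reflection mechanics only; whether flipping the real flag is enough to keep JFace happy is not something this snippet proves.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for org.eclipse.core.internal.runtime.InternalPlatform.
class FakePlatform {
    private static boolean initialized = false;

    static boolean isInitialized() {
        return initialized;
    }
}

class ForceInitialized {
    // Sets a private static boolean field on the given class via reflection.
    static void setPrivateStaticBoolean(Class<?> owner, String fieldName, boolean value) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            field.setAccessible(true);     // may be refused by an installed SecurityManager
            field.setBoolean(null, value); // null receiver because the field is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not set " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("before: " + FakePlatform.isInitialized());
        setPrivateStaticBoolean(FakePlatform.class, "initialized", true);
        System.out.println("after: " + FakePlatform.isInitialized());
    }
}
```

Compared with patching runtime.jar, this leaves the Eclipse binaries untouched, at the price of depending on a private field name that may change between releases.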

Written by Adrian

March 25th, 2004 at 8:23 pm

Posted in AndEverythingElse

Scripting languages not just for toys (a Ruby web framework)

According to David Heinemeier Hansson, the RAILS 1KLOC (!) web framework written in Ruby was used to develop Basecamp, a mildly complex project management webapp. What's fascinating is that the Ruby code used to develop Basecamp is only 4KLOC (according to the RAILS document) and was developed in less than 2 man-months (including 212 test cases and all the bells and whistles). Although I strongly suspect that there was a lot of templating going on under the hood, and I doubt that template development time was included, the efficiency is amazing, especially if you consider that they started almost from scratch. Makes you wonder what would be possible in Ruby with a comprehensive library such as PHP's PEAR (well, duh, ignoring the fact that it should be called REAR, which is a very very very nasty name) ?

Written by Adrian

March 24th, 2004 at 2:07 pm

Posted in Tools

Book review – Tapestry In Action

My first contact with Tapestry was more than 18 months ago. Back then, I was interested in finding a web framework to integrate with our custom Avalon-based server (using the now-obsolete Phoenix)*. The web interface was meant to be backoffice stuff, for simple administration tasks as well as statistics reports. Given that the data access and business logic were already developed, we were looking for something simple to plug into a no-frills servlet container such as Jetty. We managed very easily to integrate Jetty as a Phoenix service and pass data through the engine context. But when we finally integrated Tapestry [into Jetty [inside Phoenix]] and made it display some aggregated statistics, the project funding was cut and the startup went south. But that’s another story, and a rather uninteresting one.

Meanwhile, things have changed a bit. Tapestry has become a first-class Apache Jakarta project, the Tapestry users list is more and more crowded, and I see it in use again, both in my day job (by Teodor Danciu, one of my coworkers and incidentally the author of Jasper Reports) and in my own moonlighting on an older web project idea. There is also exceptional Eclipse support via the Spindle plugin. While the ‘buzzword impact’ of Tapestry on a Java developer’s CV doesn’t yet measure up to that of Struts, this framework has obviously gained a lot of attention lately.

So, what’s so special about it ? If I had to choose only one small phrase I’d quote Howard Lewis Ship, Tapestry lead developer, from the preface of his book ‘Tapestry in Action’:

The central goal of Tapestry is to make the easiest choice the correct choice.

In my opinion this is the conceptual center of gravity of the framework. Everything, from the template system, which has only the bare minimum of scripting power, through the componentized model, up to the precise, detailed error reporting (a quite unique feature in the open-source frameworks world), gently pushes you (the developer) to Do The Right Thing: put logic where it belongs (classes, not templates), organize repetitive code in components, ignore the HTTP plumbing and use a real, consistent MVC model in your apps (forms are readable and writable components of your pages). You don’t need to be Harry Tuttle to make a good Tapestry webapp; a decent Java developer is enough. That’s more than I can say about Struts …

Coming from a classic JSP-based webapp world, Tapestry is really a culture shock. The most appropriate way to visualise the difference is to imagine a pure C programmer abruptly moving to C++, into the world of objects**. For a while, he will try to emulate the ‘old’ way of working, but soon enough he’ll give up and start coding his own classes. However, this C programmer will have to make some serious efforts, not necessarily because OOP is hard to learn, but in order to break his old habits.

“Tapestry in Action” is your exit route from the ugly world of HTTP stateless pages and spaghetti HTML intertwingled with Java code and various macros. It’s one of the best JSP detoxification pills available on the market right now.

The first part of the book (‘Using basic Tapestry components’) is nothing to brag about. It’s basically an updated and nicely organized version of the various tutorials already available via the Tapestry site, except probably for some sections in chapter 5 (‘Form input validation’). By the way, chapter 5 is freely downloadable on the Manning site and is a perfect read if you want a glimpse of the fundamental differences between Tapestry and a classic web framework (form validation being an essential part of any dynamic site). However, if you want to go beyond the ‘Hangman’*** phase you really need to dig into the next two sections of the book.

The second section, ‘Creating Tapestry components’, is less covered by the documentation and tutorials. I’m specifically pointing here to the subsections ‘Tapestry under the hood’ (juicy details about page and component lifecycles) and ‘Advanced techniques’ (there’s even an ‘Integrating with JSP’ chapter !). While it is true that any point from this section will generally be revealed by a search on the Tapestry user list or (if you’re patient) by a kind soul answering your question on this same list, it’s nevertheless a good thing to have all the answers nicely organized on the corner of your desk.

The third and last section (‘Building complete Tapestry application’) is a complete novelty for Tapestry fans. It’s basically a thorough description of how to build a web application (a ‘virtual library’) from scratch using Tapestry. While the JBoss-EJB combination chosen by the author is not exactly my cup of tea (I’m rather into the Jetty+Picocontainer+Hibernate stuff), I can understand the strong appeal that it is supposed to have among the J2EE jocks. Anyway, given the componentized nature of Tapestry, I should be able to migrate it relatively easily if I feel the need for it. The example app is contained in a hefty 1Meg downloadable archive of sources, build and deployment scripts included.

To conclude, ‘Tapestry in Action’ is a great book about how to change the way you develop web applications. The steep learning curve is a small price to pay for a two or three-fold improvement in overall productivity. And this book should get you started really quickly.

*Which AFAIK is still used at our former customer.
**There were some posts on the Tapestry user list about a certain conceptual resemblance to Apple’s WebObjects. I can’t really comment on this because I do not know WebObjects, but the name in itself is an interesting clue.
***’Hangman’ is the Tapestry ‘Petshop’ (although there is also a ‘real’ Tapestry Petshop referenced in the Wiki).

Written by Adrian

March 18th, 2004 at 2:56 pm

Posted in Books

Ant goodies : extracting info from Eclipse .classpath

IMPORTANT UPDATE: Please note that 'antclipse' is now part of the ant-contrib at Sourceforge, under Apache licence.

Original blogpost:

I hate duplicating information manually – besides, it's a known fact that duplication is a classic code smell that tells you to refactor. This time it's not Java code, but something somewhat different : Ant used in an Eclipse context. The issue here is that the .classpath files generated by Eclipse hold important information which is usually duplicated by hand in the build.xml script. SO many times I've changed libraries in my project in Eclipse just to discover that the Ant build was broken…

There surely are some workarounds like the task written by Tom Davies but unfortunately:

  • It's an Eclipse plugin. I want to be able to build my project standalone, we don't need no stinkin' plugin.
  • It's rather old and checks tags so pedantically that it pukes on my 3.0M3, complaining about a certain attribute of type “con” in the .classpath file (lesson learned: don't be picky about tag and attribute names if you want the plugin to work with future versions of the software which produced the XML document, especially when you do not have a schema or DTD to rely on)
  • It's Friday evening, dark weather outside, I'm alone in the house and the TV is broken (and even if it worked, there's nothing to see on TV anyway). Boys and girls, let's write an Ant task !
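The lesson from the second bullet (read what you need, ignore what you don't recognise) can be sketched as follows. This is not the antclipse source, just an illustration with the JDK's DOM parser; the sample content mimics the shape of an Eclipse .classpath file, and the paths and the “futureattr” attribute are made up.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ClasspathReader {
    // Sample .classpath content; paths and "futureattr" are hypothetical.
    static final String SAMPLE =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<classpath>"
            + "<classpathentry kind=\"src\" path=\"src\"/>"
            + "<classpathentry kind=\"src\" path=\"test\"/>"
            + "<classpathentry kind=\"con\" path=\"org.eclipse.jdt.launching.JRE_CONTAINER\"/>"
            + "<classpathentry kind=\"lib\" path=\"lib/example.jar\" futureattr=\"x\"/>"
            + "<classpathentry kind=\"output\" path=\"bin/classes\"/>"
            + "</classpath>";

    // Returns the "path" of every classpathentry of the requested kind.
    // Unknown kinds (such as "con") and unknown attributes are simply skipped.
    static List<String> pathsOfKind(String classpathXml, String kind) {
        List<String> paths = new ArrayList<>();
        try {
            NodeList entries = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(classpathXml.getBytes(StandardCharsets.UTF_8)))
                    .getElementsByTagName("classpathentry");
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                if (kind.equals(entry.getAttribute("kind"))) {
                    paths.add(entry.getAttribute("path"));
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Unreadable .classpath content", e);
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println("sources: " + pathsOfKind(SAMPLE, "src"));
        System.out.println("libs:    " + pathsOfKind(SAMPLE, "lib"));
        System.out.println("output:  " + pathsOfKind(SAMPLE, "output"));
    }
}
```

Because the reader only asks for the kinds it understands, a new entry kind or attribute in a later Eclipse milestone does not break it.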

From the documentation, it appears that writing an Ant task should be an easy task :) . And yes, it is, once you get past all the little idiosyncrasies. Like the mandatory “to” string in a RegexpPatternMapper, although all you want to do is match, not replace. Like having completely different mechanisms for Path and FileSet (I had always thought a Path was a “dumbed down” FileSet, but I was completely wrong: a fileset is somewhat “smarter” but it only has a single directory).

The result is here, and all you have to do is download antclipse.jar (7kB) and put it in your ant/lib directory and you're set (just remember to refresh the Ant classpath if you're launching Ant from Eclipse).

What does it do ? Well, it creates classpaths or filesets based on your current .classpath file generated by Eclipse, according to the following parameters :

Attributes (name, whether required, description):

  • produce (required): tells the task whether to produce a “classpath” or a “fileset” (multiple filesets, as a matter of fact).
  • idcontainer (optional): the refid which will serve to identify the deliverables. When multiple filesets are produced, their refid is a concatenation of this value and something else (usually obtained from a path). Default is “antclipse”.
  • includelibs (optional): boolean, whether or not to include the project libraries. Default is true.
  • includesource (optional): boolean, whether or not to include the project source directories. Default is false.
  • includeoutput (optional): boolean, whether or not to include the project output directories. Default is false.
  • verbose (optional): boolean, telling the task to print some info during each step. Default is false.
  • includes (optional): a regexp for files to include. It is taken into account only when producing a classpath, and doesn't work on source or output files. It is a real regexp, not a “*” expression.
  • excludes (optional): a regexp for files to exclude. It is taken into account only when producing a classpath, and doesn't work on source or output files. It is a real regexp, not a “*” expression.

Classpath creation is simple: the task just produces a classpath that you can subsequently retrieve by its refid. The filesets are a little trickier, because the task produces one fileset per source directory and another separate fileset for the output directory. Which is not necessarily bad, since the content of each directory usually serves a different purpose. Now, in order to avoid conflicting refids, each fileset has a name composed of the idcontainer, followed by a dash and then the path. Supposing that your output path is bin/classes and the idcontainer is the default, the task will create a fileset with refid antclipse-bin/classes. The fileset will include all the files contained in your output directory, but without the bin/classes path prefix (as you usually strip it when creating the distribution jar). If you have two source directories, called src and test, you'll be provided with two filesets, with refids antclipse-src and antclipse-test.

However, you don't have to hardcode the path, since some properties are created as a “byproduct” each time you execute the task. Their names are the idcontainer followed by “outpath” and “srcpath” (in the case of the source, you'll find the location of the first source directory).

A pretty self-explanatory Ant script follows (“xml” is a forbidden file type on jroller, so just copy paste it into your favourite text editor). Note that nothing is hardcoded, it's an adaptable Ant script which should work in any Eclipse project.

<?xml version="1.0"?>
<project default="compile" name="test" basedir=".">
  <taskdef name="antclipse" classname="fr.infologic.antclipse.ClassPathTask"/>

  <target name="make.fs.output">
    <!-- creates a fileset including all the files from the output directory, called ecl1-bin if your output directory is bin -->
    <antclipse produce="fileset" idcontainer="ecl1" includeoutput="true" includesource="false" includelibs="false" verbose="true"/>
  </target>

  <target name="make.fs.sources">
    <!-- creates a fileset for each source directory, called ecl2-*source-dir-name* -->
    <antclipse produce="fileset" idcontainer="ecl2" includeoutput="false" includesource="true" includelibs="false" verbose="true"/>
  </target>

  <target name="make.fs.libs">
    <!-- creates a fileset containing all your project libs, called ecl3 -->
    <antclipse produce="fileset" idcontainer="ecl3" verbose="true"/>
  </target>

  <target name="make.cp">
    <!-- creates a classpath (project libs plus output directory), called eclp -->
    <antclipse produce="classpath" idcontainer="eclp" verbose="true" includeoutput="true"/>
  </target>

  <target name="compile" depends="make.fs.libs, make.fs.output, make.fs.sources, make.cp">
    <echo message="The output path is ${ecl1outpath}"/>
    <echo message="The source path is ${ecl2srcpath}"/>
    <!-- makes a jar file with the content of the output directory -->
    <zip destfile="out.jar"><fileset refid="ecl1-${ecl1outpath}"/></zip>
    <!-- makes a zip file with all your sources (supposing you have only one source directory) -->
    <zip destfile="src.zip"><fileset refid="ecl2-${ecl2srcpath}"/></zip>
    <!-- makes a big zip file with all your project libraries -->
    <zip destfile="libs.zip"><fileset refid="ecl3"/></zip>
    <!-- imports the classpath into a property then echoes the property -->
    <property name="cpcontent" refid="eclp"/>
    <echo>The newly created classpath is ${cpcontent}</echo>
  </target>
</project>

TODOS : make “includes” and “excludes” work on the source and output filesets, find an elegant solution to this multiple fileset/directories issue, and most important, make it work with files referenced in other projects.

I am aware that the task is very far from being perfect, so just download it if you're interested, try to use it, try to break it, and tell me what you think and how it can be improved. Also, if you're interested in the source, just send me an email, but be aware that it's Friday evening beer-induced source code, nothing to be proud of… It was only tested with Ant 1.5.x, so YMMV. I assume no responsibility if you use it in a production environment.

Written by Adrian

March 1st, 2004 at 5:17 pm

Posted in Tools
