Once I needed some functionality in DataStage that was not there, and a JavaTransformer was not an option (plus I hate PX functions)…
This is just a braindump of how to create a DS Build Stage, no « analytics » whatsoever.
So, even if DataStage is a precursor of most of BigData (and was actually doing map-reduce-style processing years before Google’s paper was published), certain functions are not there. It is not overly complicated to add them, though: my knowledge of C/C++ is quite rough, and it still wasn’t too difficult.
To create a new stage (build type), just go through the menu (it’s IBM!):

…but actually, before creating a stage, one needs to create/import the stage’s input/output table definitions (hopefully you know how to do that).

Now, we’re ready to create the stage:

I didn’t want to produce any icons or anything fancy, so I went straight to building, starting with importing the interface definitions we’ve just created:


In my case I needed to add one « include » (DataStage’s includes are automatic, from what I gather):

And… I had to produce the following code (I’m just checking a string against a regexp).
The only problem was extracting the (char*) from DS’s string type. Poking around the « includes » in the DS install directory, I found the function terminatedContent()
(IBM\InformationServer\Server\PXEngine\include\apt_util\basicstring.h),

which I’m using here:
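As a rough standalone sketch of that idea (not the original stage code), the check boils down to getting a NUL-terminated char* out of the string and handing it to a regexp matcher. Everything here apart from the name terminatedContent() is an assumption: DsStringStandIn is a hypothetical stand-in for the DS string type so the snippet compiles outside DataStage, matchesPattern and checkField are made-up helper names, and std::regex is used only for illustration (the compiler configured for your DS install may need a different regexp library).

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical stand-in for the DS string type, so this compiles outside
// DataStage. Only the method name terminatedContent() comes from the post;
// the real return type may differ.
struct DsStringStandIn {
    std::string data;
    const char* terminatedContent() const { return data.c_str(); }
};

// Returns true when the whole value matches the pattern (ECMAScript syntax).
// A malformed pattern is treated as "no match" instead of aborting the job.
bool matchesPattern(const char* value, const char* pattern) {
    assert(pattern != nullptr);  // the pattern is a programmer-supplied constant
    if (value == nullptr) {
        return false;
    }
    try {
        const std::regex re(pattern);
        return std::regex_match(value, re);
    } catch (const std::regex_error&) {
        return false;
    }
}

// Returns 1 or 0 so the result can be copied to an integer output column.
int checkField(const DsStringStandIn& field, const char* pattern) {
    return matchesPattern(field.terminatedContent(), pattern) ? 1 : 0;
}
```

In a real build stage the stand-in struct would go away and the input column itself would be the thing you call terminatedContent() on; the 1/0 result would then be assigned to an output column.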

Compilation. I figured out that without « Verbose » enabled it’s impossible to debug the thing:

So, now we can click Generate… and if the log looks like the one I’ve included, everything built OK:


…and you should see the new stage in the Parallel Palette (a restart of the DS client may be needed):

That’s all, folks.