Multi-threading in iCluster
Customers occasionally ask how iCluster handles multi-threading. The answer depends on what is meant by multi-threading. At the hardware level, the Power chips have multi-threading built in, called SMT (Simultaneous Multi-Threading) or, in its newer form, SMT4. In essence, SMT enables the concurrent execution of the instruction streams of multiple threads on the same core. A great resource document that describes this hardware feature is here:
This type of multi-threading is outside the realm of an application program.
At the application level, multi-threaded programming was introduced back in V4R2 of IBM OS/400. This is the ability for an application program (initially C and Java only, expanded to COBOL and RPG in V4R4) to spawn additional, simultaneously executing paths of instructions. As an example, consider a payroll application. In its old single-threaded days it would take in a new employee name, update the employee file, create a new user profile for the employee, create a new email account, generate new forms for the employee to sign and then update the HR files. All of this would be done sequentially and might take hours in an overnight batch process. In a multi-threaded world, the update to the employee file would spawn a parallel thread that creates the email account and user profile, generates the forms and updates the HR file. When this parallel thread has ended, it rejoins the original thread, which finishes off by notifying HR.
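The spawn-and-rejoin pattern in the payroll example can be sketched in a few lines. This is only an illustration in Python, not code from any real payroll application or from iCluster; the function names, the `log` list and the employee name are hypothetical.

```python
import threading

# Hypothetical shared record of completed steps, guarded by a lock
# because two threads append to it.
log = []
log_lock = threading.Lock()

def record(step):
    with log_lock:
        log.append(step)

def onboarding_side_tasks(employee):
    # Work handed to the parallel thread in the example above.
    record(f"create email account for {employee}")
    record(f"create user profile for {employee}")
    record(f"generate forms for {employee}")
    record(f"update HR file for {employee}")

def onboard(employee):
    record(f"update employee file for {employee}")
    # Spawn the parallel path of execution.
    worker = threading.Thread(target=onboarding_side_tasks, args=(employee,))
    worker.start()
    # The original thread could do other work here before waiting.
    worker.join()  # the parallel thread rejoins the original thread
    record(f"notify HR about {employee}")

onboard("J. Smith")  # hypothetical employee name
```

The `join()` call is the "rejoin" step: the original thread does not notify HR until the spawned thread has finished all of its tasks.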
You can print the call stacks for a multi-threaded job using the command WRKJOB JOB(######/user/jobname) OUTPUT(*PRINT) OPTION(*PGMSTK) SLTTHD(*ALL).
iCluster jobs are not multi-threaded at this level. If you look at the job description ICLUSTER/CSJOBD (which is used for all submitted iCluster jobs), the value of Allow multiple threads is set to *NO. However, iCluster does achieve “parallelism” when scraping and applying objects in different journals, by allowing multiple “jobs” that work under the same group but on different journals.
So what customers are usually referring to is the ability to run multiple apply processes for a single group. This has been in the product for a long time. The number of “threads” being run is determined by the number of journals being scraped, either by iCluster or by the OS if remote journaling is being used. For each scraped journal there is a pair of jobs (HADDJS and HADSFP) on the source and, on the target, three jobs for each apply process (HADDJSR, HADSFPR and HADTUP). So by increasing the number of journals in the source group, or even isolating a single file in its own journal, you can take advantage of multiple parallel apply jobs per group.
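To make the arithmetic concrete, here is a small Python sketch that counts the jobs to expect for one group. It assumes one apply process per scraped journal, and the journal names are hypothetical; only the job names come from the description above. It does not call any iCluster interface.

```python
# Job names taken from the text; everything else is illustrative.
SOURCE_JOBS = ("HADDJS", "HADSFP")            # per scraped journal, on the source
TARGET_JOBS = ("HADDJSR", "HADSFPR", "HADTUP")  # per apply process, on the target

def job_totals(journals):
    """Return the expected job counts for one group.

    Assumes one apply process per scraped journal.
    """
    n = len(set(journals))  # ignore accidental duplicates
    return {
        "journals": n,
        "source_jobs": n * len(SOURCE_JOBS),
        "target_jobs": n * len(TARGET_JOBS),
    }

# Hypothetical group with three journals.
totals = job_totals(["JRNPAY", "JRNHR", "JRNORD"])
print(totals)  # {'journals': 3, 'source_jobs': 6, 'target_jobs': 9}
```

Under that assumption, moving a busy file into its own journal adds two more source jobs and three more target jobs for the group.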
So in essence iCluster *does* use multi-threading, in the sense that every journal being scraped has its own “thread”, and every group has its own threads for each of the journals it scrapes. The customer can therefore set up as much multi-threading as they want through the use of multiple groups and journals.