Custom Plug-ins

By: James Reynolds - Revised: 2006-06-06 devin

Introduction

Learn about custom plug-ins including how to create a custom plug-in and examples of custom plug-ins.

Section Links

Create Custom Plug-in
Custom Plug-in Example 1
Custom Plug-in Example 2
Agents Script
Source Code

Create Custom Plug-in

With this plug-in you can create your own job with any number and type of arguments. The plug-in you create is also saved, so you can reuse it later. Most of this is pretty intuitive, but I'll just ramble on anyway. Most likely, you will want to just try it out and skip this section entirely.

To use it, choose "New Job" from the file menu and select "Create Custom Plug-in" in the "New Job" window that will pop up.

"New Plug-in Name" - First, name the plug-in anything you like. If you name the plug-in a name that you have used already, it will ask if you want to replace it. The plug-ins are stored at "~/Library/Xgrid/Plug-ins/Plugin_name.xgplug".

"Command" - Next, give it a command to execute. This command will be TAR'ed (stored in "~/Library/Xgrid/Plug-ins/_name_.xgplug/Contents/_name_.plist" in the "Tarred Executable" key value pair) and copied to all of the agents when you run the job. So if you have agents running Mac OS X 10.2 and Mac OS X 10.3, you can't specify a system binary like /bin/sh because the 10.2 and 10.3 binaries for /bin/sh are completely incompatible. So if you are using 10.3 and you specify /bin/sh as the command, the 10.3 /bin/sh from the computer you are using will be copied to all of the agents. So if you downloaded the Darwin source code and you actually changed your /bin/sh, that is what will be copied to all of the agents. It will NOT use the agent's own /bin/sh. To overcome this hurdle, simply use a script as the command! Or if you know C, you can always write your own binary command.

"Working Dir" - The "Working Dir" is a directory on your client that will be copied to each agent, and it will be where the shell's working directory is (the result of "pwd" on the agent will be this directory--see below for more information about the actual path on the agent).

You do not want to specify a directory with lots of contents. In fact, you should try to keep this as slim as possible. You must specify a working directory and a destination (see below) if your command creates files and you want those files back.

A note about the path of the working directory and the command: your command will be copied to a folder named /tmp/xgagent.yFd6RNZg, where "yFd6RNZg" is random and will be different for each job. Your working directory will be in a similarly named folder, but it will not be the same folder. So when a job runs with a working directory, you will have 2 folders, one containing the command, the other containing the working directory, like this:

/tmp/xgagent.yFd6RNZg (command)
/tmp/xgagent.bLaHB14h (working directory)
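To see the two folders for yourself, a small script along these lines could be used as the command. This is only a sketch: it assumes the agent starts the script by its full path (so that $0 points into the command folder) and that the shell's working directory is the copied "Working Dir". The helper name report_folders is made up for this example.

```shell
#!/bin/sh
# Print the folder holding the command and the folder we were started in.
# report_folders is a hypothetical helper name, not part of Xgrid.
report_folders() {
    # $1 = path of the running script, $2 = current working directory
    command_dir=$(cd "$(dirname "$1")" && pwd)
    echo "command folder: $command_dir"
    echo "working folder: $2"
    if [ "$command_dir" = "$2" ]; then
        echo "same folder"
    else
        echo "different folders"
    fi
}

report_folders "$0" "$(pwd)"
```

If the description above holds, a job submitted with a working directory should report "different folders".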

If you have a binary for your command, and it requires library files, you will either have to install the libraries on each agent, or put them in the "Working Dir" folder and get your binary to look in a randomly named folder (/tmp/xgagent.????????/) for the libraries. I don't even know if that is possible.
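One untested possibility is to make the command a wrapper script that points the dynamic linker at the working directory before starting the real binary. The sketch below assumes the binary and its libraries were placed in the "Working Dir" as ./bin/mytool and ./lib (both names are hypothetical), and that the binary honors the DYLD_LIBRARY_PATH environment variable. It has not been verified under Xgrid.

```shell
#!/bin/sh
# Wrapper command: look for libraries inside the current working directory.
# lib_search_path, ./lib and ./bin/mytool are hypothetical names for this sketch.
lib_search_path() {
    # $1 = working directory, $2 = existing search path (may be empty)
    if [ -n "$2" ]; then
        printf '%s/lib:%s\n' "$1" "$2"
    else
        printf '%s/lib\n' "$1"
    fi
}

# The working directory is random on each agent, so ask for it at run time.
DYLD_LIBRARY_PATH=$(lib_search_path "$(pwd)" "$DYLD_LIBRARY_PATH")
export DYLD_LIBRARY_PATH

# Start the real binary only if it was shipped in the working directory.
if [ -x ./bin/mytool ]; then
    exec ./bin/mytool "$@"
else
    echo "./bin/mytool not found in $(pwd)" >&2
fi
```

Because the path is built from "pwd" when the job runs, the script never needs to know the random part of the folder name.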

Another way to get files to agents is for an agent job to run "/usr/bin/curl -O" to download files from a webserver. I had to do this as I tested Xgrid because my jobs were too large and would time out while transferring from the client to the controller. Of course, you should only do this if you have to.
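A command script that downloads its own input might look something like this. It is a sketch only: the server address in BASE_URL is a placeholder, and fetch_input and DRY_RUN are names invented for this example so the script can be tried locally without touching the network.

```shell
#!/bin/sh
# Download input files from a web server instead of shipping them with the job.
# BASE_URL is a placeholder; point it at your own server.
BASE_URL=${BASE_URL:-http://example.com/xgrid-inputs}

fetch_input() {
    # $1 = file name on the web server
    if [ -n "$DRY_RUN" ]; then
        # Only show what would be run (handy for testing the script locally).
        echo "/usr/bin/curl -O $BASE_URL/$1"
    else
        /usr/bin/curl -O "$BASE_URL/$1"
    fi
}

# Fetch every file named on the command line into the current directory.
for name in "$@"; do
    fetch_input "$name" || exit 1
done
```

With "curl -O" the file is saved in the current directory under its remote name, which on an agent would be the job's working directory.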

"Argument n" - Specify the number and type of arguments. You have a few choices: "Literal", "Range", "Random", "File", and "Filelist".

"Literal" will use a literal value that you specify. You CAN NOT put spaces in the values here. If you do, you will crash the GridAgent on each agent (as of Tech Preview 1). That really sucks, so don't do it. If this is your only argument, then your job will only run on one agent.

"Range" will use numbers. You specify the start, end, and step values. This argument will create as many jobs as there are values. If you specify 1 to 10 by 1, you have 10 jobs. If you specify 1 to 10 by 2, you have 5 jobs: 1, 3, 5, 7, 9 (not sure if 10 will be a job or not, so maybe you have 6 jobs...?).
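The arithmetic can be checked with a short loop. This sketch assumes the range begins at the start value, adds the step each time, and stops once the value would pass the end value. That behavior has not been confirmed against Xgrid itself, and range_values is a made-up helper name.

```shell
#!/bin/sh
# List the values a start/end/step range would produce, one per line.
# range_values is a hypothetical helper, not an Xgrid command.
range_values() {
    # $1 = start, $2 = end, $3 = step (must be a positive integer)
    if [ "$3" -le 0 ]; then
        echo "step must be greater than 0" >&2
        return 1
    fi
    value=$1
    while [ "$value" -le "$2" ]; do
        echo "$value"
        value=$((value + $3))
    done
}

range_values 1 10 2
```

Under that assumption, 1 to 10 by 2 gives 1, 3, 5, 7, 9 (five jobs); 10 is never produced because 9 + 2 passes the end value.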

"Random" lets you specify how many jobs there are, and the range of random numbers. I do not know if there is duplication of numbers (as is normal with random number generators), or if it only uses a number once.

"File" copies one input file to all the nodes and uses it as an argument to the command (I haven't tested it, so I'm not sure that is what it does).

"Filelist" uses each file as a job. So if you have a folder with 10 files, you have 10 jobs. Each file will be sent to the agent (stored in the working directory). The command will receive the path of the file as an argument.
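A command script for a "Filelist" job therefore only has to deal with the single path it is given. Here is a minimal sketch, assuming the path arrives as the first argument. The helper process_file and the line counting are placeholders for whatever work your job really does.

```shell
#!/bin/sh
# Command script for a "Filelist" argument: one file path is passed per job.
# process_file is a hypothetical helper; replace its body with real work.
process_file() {
    # $1 = path of the input file for this job
    if [ ! -r "$1" ]; then
        echo "cannot read input file: $1" >&2
        return 1
    fi
    name=$(basename "$1")
    lines=$(wc -l < "$1" | tr -d ' ')
    echo "$name: $lines lines"
}

if [ $# -gt 0 ]; then
    process_file "$1"
fi
```

Anything the script prints ends up in that job's output file, so each input file gets its own result.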

You can only have 251 files by default (Tech Preview 1--maybe this will be fixed in a later release). The reason why is that Unix kernels have a setting that limits the number of open files for each process, called "Open File Descriptor". On Mac OS X 10.2 and 10.3, that number is 256. You can easily extend that number using the Terminal.app.

However, increasing the open file descriptors may not really help because I noticed that somewhere between 100 and 200 files, when I submit a job, it generates a timeout error (in Tech Preview 1). So unless a later release of Xgrid extends the timeout, changing the number won't help much. But in case Apple extends the timeout and doesn't fix the max open file descriptors, this is how you increase it on your own. First, quit Xgrid.app.

Then if you are using tcsh as your default shell:

limit descriptors 512
/Applications/Xgrid.app/Contents/MacOS/Xgrid

If you are using sh as your default shell:

ulimit -n 512
/Applications/Xgrid.app/Contents/MacOS/Xgrid

You can replace "512" with any number greater than the number of jobs you have.

You will have to leave that Terminal window open while you have Xgrid open, or Xgrid will quit (without warning).
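To pick a number, you can count the files in the folder you plan to submit. The sketch below assumes an overhead of 5 descriptors, which is only inferred from the figures above (256 minus 251) and is not documented anywhere. The helper name needed_descriptors is made up.

```shell
#!/bin/sh
# Work out a descriptor limit for a "Filelist" job.
# The overhead of 5 is inferred from the numbers above (256 - 251), not documented.
OVERHEAD=5

needed_descriptors() {
    # $1 = number of files in the job
    echo $(( $1 + OVERHEAD ))
}

# Folder holding the job's input files (defaults to the current directory).
job_dir=${1:-.}
file_count=$(ls "$job_dir" | wc -l | tr -d ' ')
needed=$(needed_descriptors "$file_count")
current=$(ulimit -n)

echo "files: $file_count, descriptors needed: $needed, current limit: $current"
if [ "$current" != "unlimited" ] && [ "$current" -lt "$needed" ]; then
    echo "run: ulimit -n $needed   (then start Xgrid from this same shell)"
fi
```

The script only reports the numbers. The limit still has to be raised in the same shell that launches Xgrid, as shown above.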

"Stdin File" - If your command needs input from STDIN, you can specify a file to be typed to your command. In other words, when your command runs on the agents, the contents of this file will be "typed" for you.

"Destination" - If you expect some sort of output from your command, you must specify somewhere to store that. Each job has one output file. The default destination is ~/Desktop, but if you don't change it, you can easily create a file mess on your Desktop, especially if you had over 100 jobs! Just make an empty folder and specify that as your destination.

The stdout and stderr of jobs are streamed (separately) to the client and stored (together) in the output file. Each file will be named after the command, followed by each argument, separated by underscores. So a command /bin/sh with the arguments "-c" and "hostname" and a range of arguments will have output files that look like:

sh_-c_hostname_1.txt
sh_-c_hostname_2.txt
sh_-c_hostname_3.txt
...
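If you want to predict the file names in a script of your own, the naming can be imitated like this. The helper output_name is hypothetical and simply follows the pattern described above (the command's base name, then each argument, joined by underscores). It is not taken from Xgrid's code.

```shell
#!/bin/sh
# Rebuild the output file name from a command and its arguments.
# output_name is a hypothetical helper that imitates the naming described above.
output_name() {
    # $1 = command path, remaining arguments = the job's arguments
    name=$(basename "$1")
    shift
    for arg in "$@"; do
        name="${name}_${arg}"
    done
    printf '%s.txt\n' "$name"
}

output_name /bin/sh -c hostname 1
```

For the example above this prints sh_-c_hostname_1.txt.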

Also, if you specified a working directory, the working directory on the agents is copied (dittoed is the more exact behavior) back to the client to the destination folder. Any files that are created on agents should be unique for each job to make sure it isn't replaced by a file with the same name from a different agent.
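One simple way to keep the names unique is to build them from the agent's host name and the job's argument. This is a sketch, assuming the job's distinguishing argument (for example the range value) arrives as the first argument. The helper unique_result_name and the "result" prefix are made up for illustration.

```shell
#!/bin/sh
# Build a result file name that differs for every agent and every job argument.
# unique_result_name is a hypothetical helper, not part of Xgrid.
unique_result_name() {
    # $1 = prefix, $2 = this job's argument (for example the range value)
    printf '%s_%s_%s.txt\n' "$1" "$(uname -n)" "$2"
}

if [ $# -gt 0 ]; then
    out=$(unique_result_name result "$1")
    # Written into the working directory, which is copied back to the client.
    echo "job argument $1 ran on $(uname -n)" > "$out"
fi
```

Since each range value is used by only one job, two jobs should never write to the same file name even when they land on the same agent.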

Also, as of Tech Preview 1, if jobs create a lot of large files, it may crash the controller. You will know the controller has crashed if your Xgrid.app is asking to connect to a controller. To work around this, my agents uploaded their files to a webserver. See the Xgrid POV-Ray movie page for an explanation of that.

Submit job/Create New Plug-in - Finally, submit the job to test it or create it to store it. If you create the job and close the "Create Custom Plug-in" window, you can still change options like arguments, working directory, destination, etc. However, you will have to start over if you want to change the command that is executed.

Custom plug-ins are stored in ~/Library/Xgrid/Plug-ins. You can move plugins to that folder to make them available in the job list, or you can delete them to remove them from the list.

Custom Plug-in Example 1

Here is a plug-in that will tell you how fast your agents are. It executes the command "/usr/sbin/system_profiler SPHardwareDataType n" where "n" is replaced with a number (and is ignored by /usr/sbin/system_profiler). Specifying "SPHardwareDataType" only works on Mac OS X 10.3.

I used the argument range 1 to 23 because I knew I had 23 agents. If I had used a smaller range, not all of the agents would have executed the command. If I specified a larger range, several agents would have executed the job twice.

Because I am specifying a system binary as the command, it will run only on one version of the OS (10.2 system binaries are not compatible with 10.3 system binaries). A work around is to use a script, like this:

#!/bin/sh

system_profiler SPHardwareDataType

Custom Plug-in Example 2

Here is a plug-in that will run "/bin/sh -c hostname n" where "n" is replaced by a number. The command "/bin/sh -c" will execute the next argument, which is "hostname". The number "n" is ignored by "/bin/sh -c" (or at least it doesn't cause an error). I did this because if I used "/bin/hostname" as the command and then a range of values, hostname complains saying: "hostname: sethostname: Operation not permitted" (it was trying to set the hostname to the number "n").

Again, because I am specifying a system binary as the command, it will run only on one version of the OS (10.2 system binaries are not compatible with 10.3 system binaries). A work around is to use a script, like this:

#!/bin/sh

hostname

In this example, I wasn't sure how many agents I had, so I used a range that was larger than the number of agents I had installed. I find the exact number of agents with a post-processing script. After I run the grid job, I cd to the output directory, create a script named "countagents", and run it like this: "./countagents sh_-c_hostname_". Here is the script:

#!/bin/sh
# Count the jobs that ran and the distinct agents that answered,
# using the per-job output files in the current directory.

if [ "$1" = "" ]; then
echo "Usage: $0 <filename excluding variable portion>"
echo "Example: $0 sh_-c_hostname_"
exit 1
fi

# Collect every job's output, then reduce it to one line per unique hostname.
cat "$1"*.txt > /tmp/rollcall_all_list
sort /tmp/rollcall_all_list > /tmp/rollcall_sorted_list
uniq /tmp/rollcall_sorted_list > /tmp/rollcall_small_list
echo Total Jobs
wc -l < /tmp/rollcall_all_list
cat /tmp/rollcall_small_list
echo Total Agents
wc -l < /tmp/rollcall_small_list
# Uncomment to clean up the temporary files:
#rm /tmp/rollcall_all_list /tmp/rollcall_sorted_list /tmp/rollcall_small_list

Here is a picture of the output:

Agents Script

Here is a script that will tell you all kinds of information about your agents. This is absolutely essential for understanding what is happening on the agents when a job is running. For every job I submit, I run this script first, so that I can use its output to debug the job if something goes wrong.

#!/usr/bin/perl

use strict;
use warnings;

# Flush each print right away so it stays in order with the output of system().
$| = 1;

machine_info();

# Print a report about the agent and the environment this job is running in.
sub machine_info {
print "----------------------------------------------------\n";
print "The date is:\n";
system "/bin/date";
print "----------------------------------------------------\n";
print "My hostname is:\n";
system "/bin/hostname";
print "----------------------------------------------------\n";
print "This computer is:\n";
system "/usr/sbin/system_profiler SPHardwareDataType";
print "----------------------------------------------------\n";
print "My arguments are: @ARGV\n";
print "----------------------------------------------------\n";
print "My executable path is: $0\n";
print "----------------------------------------------------\n";
print "I am currently in this directory:\n";
system "/bin/pwd";
print "----------------------------------------------------\n";
print "This is all the perl processes running:\n";
system "/bin/ps -awwx | grep perl";
print "----------------------------------------------------\n";
print "This is my swap files (/var/vm):\n";
system "/bin/ls -l /var/vm";
print "----------------------------------------------------\n";
print "This is the contents of all the sub folders where I am located:\n";
system "/bin/ls -Rl";
}

I also run

system "/bin/ls -l /var/vm";

after the job so that I can see how much impact it had. If a lot of swap files exist after running an Xgrid job, this will slow down the computer until it is restarted.
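To turn the listing into a single number that is easy to compare before and after a job, the sizes can be added up. This is a sketch, assuming the fifth column of "ls -l" is the size in bytes (true for regular files). The helper name dir_bytes is made up.

```shell
#!/bin/sh
# Total the sizes (in bytes) of the files in a folder, as listed by "ls -l".
# dir_bytes is a hypothetical helper; /var/vm is the swap folder named above.
dir_bytes() {
    # $1 = folder to total; skip the leading "total" line, add up column 5
    ls -l "$1" 2>/dev/null | awk '$1 != "total" { total += $5 } END { print total + 0 }'
}

swap_dir=${1:-/var/vm}
echo "bytes in $swap_dir: $(dir_bytes "$swap_dir")"
```

Run it once before the job and once after, then compare the two totals.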

Source Code

The source code for the Shell plug-in is located on the Xgrid.dmg. It is not installed by the installer. There is also an Xcode template that can be copied to the Xcode template folder. Then, when you select "New Project" in Xcode, it will have an option for "Xgrid plug-in".

The focal point of the Xgrid source code is the method that submits a job, "submitJobWithParameters". It takes an NSDictionary. Here is a plist representation of the objects submitJobWithParameters expects:

<dict>
        <key>CommandLineDictionaries</key>
        <array>
            <dict>
                    <key>Command</key>
                    <string>/bin/sh</string>
                    <key>Inputs</key>
                    <dict>
                            <key>StandardInput</key>
                            <data>hostname</data>
                    </dict>
            </dict>
        </array>
</dict>

I should say, I think this is what submitJobWithParameters expects. I didn't triple check the source code to make sure, but that is what it appeared to be.

Also, the value for StandardInput key should be UTF 8 encoded.

And I believe the value for the "Command" key is copied to the agents. In the above example, "/bin/sh" on the client is copied to each agent.

Also, the above object will only generate one job. If you want more than one job, the NSArray must have more than one value. I *think* it would look like this:

<dict>
        <key>CommandLineDictionaries</key>
        <array>
            <dict>
            ... job 1
            </dict>
            <dict>
            ... job 2
            </dict>
            <dict>
            ... job 3
            </dict>
        </array>
</dict>

submitJobWithParameters is declared in the GridPlugs.framework (/Library/Xgrid/Frameworks/GridPlugs.framework) in the file XgJobViewController.h.

These are some other interesting commands in that header file:

submitJobWithParameters
setJob
jobItemDidFinish
jobItemDidFail
jobItemDidCancel
jobItemLatestJobResultsDidArrive
updateStatusField

And finally, if you are going to be writing a plug-in, you should also look at the GridPlugs and BEEP frameworks. These are their header files:

GridPlugs.framework

GridPlugs.h
XgJobDocumentProtocol.h
XgJobPane.h
XgJobProtocol.h
XgJobViewController.h
XgTarWrapper.h

BEEP.framework

BEEP.h
BEEPChannel.h
BEEPError.h
BEEPMessage.h
BEEPSession.h
BEEPSessionAcceptor.h
BEEPSessionConnector.h
BEEPSSLContext.h