#!/usr/bin/env tclsh
# Set the host and port of the LDAS managerAPI
set hostName "ldas.ligo-wa.caltech.edu"
set port 10001
# Set the user command
set ldasCmd "ldasJob\
{-name myname -password blah -email me@ligo.caltech.edu}\
{getMetaData -returnprotocol http:out.xml -outputformat LIGO_LW\
-sqlquery {select tabname from syscat.tables} }"
#---- Code below here is the same for all jobs ----
# Open a socket to the LDAS managerAPI
if { [catch {socket $hostName $port} sockId] } {
puts "Unable to connect to LDAS manager"
exit
}
# Send the LDAS command
puts $sockId $ldasCmd
flush $sockId
# Get the response from the managerAPI and print it out
set jobInfo [read $sockId]
puts $jobInfo
# Close the socket connection and exit
close $sockId
The script above starts the LDAS job and then exits, without waiting
for the LDAS job to finish. It is admirably straightforward, but
there are some drawbacks:
To address these issues and more, the ldasjob LIGOtools package has been designed to provide a more robust, flexible, and user-friendly interface for running LDAS jobs from within scripts. Features include:
Before describing the ldasjob library routines in detail, it
is worth considering a few examples.
A simple example
Using the ldasjob library, the example script from the Introduction can be rewritten as follows:
#!/usr/bin/env tclshexe
package require ldasjob
LJrun job1 -manager lho {
getMetaData -returnprotocol http:out.xml -outputformat LIGO_LW
-sqlquery {select tabname from syscat.tables}
}
puts $job1(jobInfo)
The first line of the script, #!/usr/bin/env tclshexe, has the
effect of starting up the LIGOtools version of the Tcl shell
interpreter to execute the rest of the script. The second line of the
script, package require ldasjob,
loads library code to define several new Tcl procedures. One of these
is LJrun, which takes arguments
to indicate the LDAS user command to be executed and the LDAS manager
to which it should be sent (lho
in this case, which is acceptable shorthand for the LDAS system at
LIGO Hanford Observatory), and associates the job with a "tag" (job1). Note that only the "core" LDAS
user command is specified; the library reads the user's LDAS username
and password from the file ~/.ldaspw, where they were previously
stored using the ldaspw utility.
There is also no email address in the script; the LJrun
command simply "blocks" while the LDAS job runs, and returns when the
LDAS job finishes, without any email being sent to the user. Finally,
this example script executes puts $job1(jobInfo) to print out
the message from LDAS containing information about the job which was
submitted.
#!/usr/bin/env tclshexe
package require ldasjob
set table "syscat.tables"
LJrun job1 {
getMetaData
-returnprotocol http:out.xml
-outputformat LIGO_LW # This is "LIGO lightweight" format,
# which is based on XML
# -sqlquery {select tabname,tabschema from $table}
-sqlquery {select tabname from $table}
}
if $LJerror {
puts "LDAS job failed! Error message is:"
puts $job1(error)
exit 1
}
puts "LDAS job $job1(jobid) succeeded. Reply message from LDAS:"
puts $job1(jobReply)
The most significant new feature in this example script is the test to see
whether the LDAS job succeeded or failed, by checking the value of the
LJerror variable after executing
the job. If the job succeeded, then the "reply" message from LDAS
(i.e. the message normally sent by email, stating that the job
has finished and giving the location of the output, if any) is
printed.
In this example, the LJrun command does not specify the LDAS manager to which the job should be sent; therefore, this will be determined from the environment variable LDASMANAGER, or by using the '-manager' command-line option when executing the script. This allows you to submit your job to one LDAS installation or another without modifying your script.
This example also shows that the LDAS user command, enclosed in curly braces, can contain extra spaces and newlines, as well as comments beginning with the "#" character. These comments are removed, and all newlines are converted to ordinary spaces, before the user command is actually sent to LDAS.
Finally, note that Tcl variable substitution (in this case, for '$table'), as well as command substitution (inside square brackets), is performed on the LDAS user command before it is submitted.
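The preprocessing described above (comment removal, newline folding, and substitution in the caller's scope) can be imitated in plain Tcl. The following is only a sketch: prepareCommand is a hypothetical helper, not part of ldasjob, and both the comment rule (a "#" at the start of a line or after whitespace) and the use of subst -nobackslashes are assumptions about how the library might behave.

```tcl
# Hypothetical helper (not part of ldasjob) that mimics the preprocessing
# described in the text.
proc prepareCommand {text} {
    # Drop comments: "#" at start of line or after whitespace, to end of line.
    regsub -all -line {(^|\s)#.*$} $text {\1} text
    # Perform variable and command substitution in the caller's scope.
    set text [uplevel 1 [list subst -nobackslashes $text]]
    # Fold newlines and runs of whitespace into single spaces.
    regsub -all {\s+} $text { } text
    return [string trim $text]
}

set table "syscat.tables"
set cmd [prepareCommand {
    getMetaData
        -outputformat LIGO_LW     # "LIGO lightweight" format
        # -sqlquery {select tabname,tabschema from $table}
        -sqlquery {select tabname from $table}
}]
puts $cmd
```

Printing the prepared string before submitting a job is a cheap way to confirm that the substitutions produced the command you intended.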
You should definitely use the tclshexe shell, which is part of LIGOtools, rather than the ordinary tclsh shell which might be installed on your computer. tclshexe is a fully functional version of the Tcl shell which has been specially compiled to automatically check for Tcl libraries (such as the ldasjob library) in $LIGOTOOLS/lib. In the future, it might also contain LIGO-specific extensions to the core Tcl language, for instance to implement encrypted communications for submitting LDAS jobs.
The ldasjob library may be loaded in either of two ways,
differing only in capitalization:
package require ldasjob
or
package require LDASJob
When the ldasjob package is loaded, it does a few things with command-line arguments (if any) and environment variables for the convenience of the script author:
#!/usr/bin/env tclshexe
package require ldasjob
#-- Make sure that the required arguments were specified
if { ${#argv} != 2 } {
puts "Usage: myscript <LDAS manager> <column list>"; exit 0
}
LJrun job1 -manager $1 {
getMetaData -returnprotocol http:out.xml -outputformat LIGO_LW
-sqlquery {select $2 from syscat.tables}
}
if $LJerror {
puts "LDAS job failed! Error message is:"
puts $job1(error)
exit 1
}
puts "LDAS job $job1(jobid) succeeded. Reply message from LDAS:"
puts $job1(jobReply)
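In the script above, ${#argv}, $1 and $2 appear to act as shell-style shorthand for the argument count and the positional arguments. For comparison, the same check can be written with standard Tcl list commands. This sketch assumes that reading of the shorthand; checkArgs is a hypothetical helper, not a library routine.

```tcl
# Hypothetical helper: validate an argument list using standard Tcl commands.
proc checkArgs {argList} {
    if {[llength $argList] != 2} {
        return -code error "Usage: myscript <LDAS manager> <column list>"
    }
    # lindex extracts positional arguments (0-based)
    set manager [lindex $argList 0]
    set columns [lindex $argList 1]
    return [list $manager $columns]
}

# In a real script the list would be the global variable argv.
puts [checkArgs {lho tabname,tabschema}]
if {[catch {checkArgs {lho}} msg]} {
    puts $msg
}
```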
LJrun <job_tag> [<options>] <LDAS user command>
Note that the job tag must be the first argument to LJrun, while the LDAS user command must be the last argument. The available options are:
-manager <LDAS manager>
-user <LDAS username>
-log
-log <command>
-email <email address>
-email
-email <host>:<port>
-nowait
If the LDAS job finishes successfully (or if LJrun was called with the -nowait option and the job was submitted successfully), then LJrun sets LJerror=0 and returns the LDAS job ID. If the LDAS job fails, then LJrun sets LJerror=1 and returns the error message. If a software error occurs, then LJrun sets LJerror=1, returns the error message, and generates a Tcl error condition which causes the user script to terminate (unless caught and handled).
Tcl variable substitutions (indicated by dollar signs) and command substitutions (indicated by square brackets) are performed before the user command is sent to LDAS. These substitutions are performed in the scope of the routine which calls LJrun. For example, the following script retrieves 20 seconds of frame data for one Hanford channel:
#!/usr/bin/env tclshexe
package require ldasjob
set channel H2:LSC-AS_Q
set time1 693960025
set length 20
# Calculate the end time for the query. Note that the LDAS convention
# is to INCLUDE a full second of data beginning at the end time
# specified by the user. Thus, to get exactly $length seconds of
# data, we must add ($length-1) to the start time.
set time2 [expr {$time1 + ($length - 1)}]
LJrun datajob -manager lho {
getFrameData
-returnprotocol http:out.gwf
-outputformat frame
-framequery { R [string index $channel 0] {} $time1-$time2 Adc($channel) }
# "[string index $channel 0]" returns the first letter of the
# channel name, i.e. the detector site code
}
if $LJerror {
puts "LDAS job failed! Error message is:"
puts $datajob(error)
exit 1
}
puts "LDAS job $datajob(jobid) succeeded. Output files are:"
puts $datajob(outputs)
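The end-time arithmetic and the substitutions in the -framequery argument can be previewed in plain Tcl without contacting LDAS. The sketch below uses the standard subst command as a stand-in for the substitution that LJrun performs; that LJrun behaves identically is an assumption.

```tcl
set channel H2:LSC-AS_Q
set time1 693960025
set length 20
# The end time is inclusive, so add ($length - 1) to get exactly $length seconds
set time2 [expr {$time1 + ($length - 1)}]

# Preview the framequery text after variable and command substitution
set query [subst { R [string index $channel 0] {} $time1-$time2 Adc($channel) }]
puts "End time: $time2"
puts "Query:    [string trim $query]"
```

With these values the end time is 693960044, so the range 693960025-693960044 spans exactly 20 seconds.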
Note that backslashes appearing in the user command are not
treated as special characters. For instance, if your user command
contains the string H2\:LSC-AS_Q::AdcData:693960025:0:Frame,
the backslash will be retained when the user command is sent to LDAS.
(A corollary is that you cannot suppress variable substitution
by preceding the dollar sign with a backslash, nor can you suppress
command substitution by preceding the square bracket with a
backslash. I think this is OK, since I don't know of any situation in
which the user command sent to LDAS should contain a dollar sign or
square bracket; let me know if you encounter such a situation.)
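Standard Tcl offers a comparable mode through subst -nobackslashes, which is convenient for seeing the consequences described above. Whether ldasjob uses subst internally is an assumption; the snippet only illustrates the behavior.

```tcl
set channel H2
# Backslashes are kept literally, and "\$" does not suppress substitution
set out [subst -nobackslashes {H2\:LSC-AS_Q uses \$channel}]
puts $out
```

The backslash before the colon survives unchanged, while $channel is still replaced by its value even though a backslash precedes the dollar sign.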
If you prefer, you may store the LDAS user command in a Tcl variable and then just pass this variable as an argument to LJrun. Variable and command substitutions are still performed (at the time that LJrun is called) in this case. For example:
...
set ldascmd {
getFrameData
-returnprotocol http:out.gwf
-outputformat frame
-framequery { R [string index $channel 0] {} $time1-$time2 Adc($channel) }
# "[string index $channel 0]" returns the first letter of the
# channel name, i.e. the detector site code
}
LJrun datajob -manager lho $ldascmd
...
There are four mechanisms for communicating this string to LJrun. From highest to lowest precedence:
#!/usr/bin/env tclshexe
package require ldasjob
setenv LDASMANAGER lho
...
Caveat: due to the fact that the library creates an independent "helper process" to take care of communicating with LDAS, the thing that matters is the value of the LDASMANAGER environment variable at the time your script makes its first call to LJrun. Changing LDASMANAGER after this point will have no effect on subsequent jobs. If your script needs to submit jobs to different LDAS systems, it should call LJrun with the -manager option to specify where each job should be sent.
Alternatively, you can provide your own command after the -log flag, which will be executed instead of printing the default log message. This command is executed in the scope of the routine which calls LJrun, and in principle can be anything, not just a logging command. As a convenience, the command can use "this" to refer to the job info array for the job just submitted, rather than having to use the actual job tag. For example:
... -log {puts "Job ID is $this(jobid)"} ...
prints the job ID of the LDAS job which has just been started, while
... -log "puts $fid \"Job ID is \$this(jobid)\"; flush $fid" ...
prints the same message to a file opened with file descriptor $fid. (The quoting in the latter example causes $fid to be substituted in the scope of the routine which calls LJrun, before the command is passed to LJrun.)
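The difference between the two quoting styles is ordinary Tcl behavior and can be checked on its own. In this sketch the array named this and the job ID are stand-ins created by hand for illustration; in a real script the library provides them.

```tcl
set fid stdout
# Stand-in for the job info array that LJrun would make available as "this"
array set this {jobid NORMAL1234}

# Braces defer all substitution until the command is executed
set braced {puts $fid "Job ID is $this(jobid)"}
# Double quotes substitute $fid immediately; \$ defers $this(jobid)
set quoted "puts $fid \"Job ID is \$this(jobid)\""

puts $quoted
eval $quoted
```

The first puts shows that the quoted form already contains the channel name in place of $fid, while $this(jobid) is left for later evaluation.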
If you specify an email address after the -email flag, that address will be used. If you do not specify an email address after the -email flag, then the email address is taken from the LDASEMAIL environment variable; an error occurs if it is not set.
To support certain special applications, you may specify a server socket address (in the form <host>:<port>) instead of an ordinary email address. In this case, when the job is finished, LDAS will connect to this socket and transmit the message rather than sending it by email.
...
proc RunIt {timerange channel} {
#-- Delete the job tag if it already exists (if not, LJdelete just returns)
LJdelete datajob
LJrun datajob -manager lho {
getFrameData
-returnprotocol http:out.gwf
-outputformat frame
-framequery { R [string index $channel 0] {} $timerange Adc($channel) }
# "[string index $channel 0]" returns the first letter of the
# channel name, i.e. the detector site code
}
}
RunIt 693960000-693960064 H1:LSC-AS_Q
if $LJerror {
puts "H1 job failed! Error message is: $datajob(error)"
exit 1
}
set h1file $datajob(outputs)
RunIt 693960000-693960064 H2:LSC-AS_Q
if $LJerror {
puts "H2 job failed! Error message is: $datajob(error)"
exit 1
}
set h2file $datajob(outputs)
...
However, there is one subtlety: LJerror and job info arrays
are automatically brought into scope in parent routines, but not in
other subroutines. To access them in other subroutines, you must
explicitly bring them into scope with the Tcl "global" command. (It is safe to do
this even if they have already been brought into scope.) This is done
in the "CheckIt" routine in the following example:
...
proc RunIt {timerange channel} {
#-- Delete the job tag if it already exists (if not, LJdelete just returns)
LJdelete datajob
LJrun datajob -manager lho {
getFrameData
-returnprotocol http:out.gwf
-outputformat frame
-framequery { R [string index $channel 0] {} $timerange Adc($channel) }
# "[string index $channel 0]" returns the first letter of the
# channel name, i.e. the detector site code
}
}
proc CheckIt {jobname} {
global LJerror datajob ;#-- Needed to bring these into scope
if $LJerror {
puts "$jobname job failed! Error message is: $datajob(error)"
exit 1
}
}
RunIt 693960000-693960064 H1:LSC-AS_Q
CheckIt H1
set h1file $datajob(outputs)
RunIt 693960000-693960064 H2:LSC-AS_Q
CheckIt H2
set h2file $datajob(outputs)
...
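The scoping rule is the standard Tcl one for global variables, and can be seen with stand-in values. In this sketch LJerror and datajob are set by hand rather than by the library.

```tcl
# Stand-ins for values that LJrun would normally set
set LJerror 1
array set datajob {error "channel not found"}

proc withoutGlobal {} {
    # No "global" declaration: the variable is not visible here
    return [info exists LJerror]
}
proc withGlobal {} {
    global LJerror datajob
    return "LJerror=$LJerror, error=$datajob(error)"
}

puts [withoutGlobal]
puts [withGlobal]
```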
Errors intrinsic to the software (e.g. syntax or logic errors in the user script, or internal software errors in the library code) will also cause LJerror to be set equal to 1, but more importantly, they will cause a Tcl error condition. Normally, this will cause the user script to terminate with an informative error message and stack trace. It is possible to use Tcl's catch command to ignore such an error or to handle it "gracefully", although it is generally preferable to write the code so as to avoid generating the error in the first place. For example, consider the following code from a user script:
...
LJrun job1 {
getMetaData -returnprotocol http:out.xml -outputformat LIGO_LW
-sqlquery {select tabname from syscat.tables}
}
puts "Output http directory is $job1(outputDir)"
...
If the LDAS job fails, then the outputDir element of the
array will not be set, and a Tcl error will be generated when the
script tries to access it. It would be better to modify the script to
make sure that the job succeeded before trying to read that element of
the array, e.g.:
...
LJrun job1 {
getMetaData -returnprotocol http:out.xml -outputformat LIGO_LW
-sqlquery {select tabname from syscat.tables}
}
if $LJerror {
puts "Job failed!"
} else {
puts "Output http directory is $job1(outputDir)"
}
...
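In ordinary Tcl, the existence of an array element can also be tested directly with info exists. The sketch uses a hand-built array as a stand-in for the info array of a failed job; how info exists interacts with the library's job info arrays is not confirmed here, so checking LJerror remains the documented approach.

```tcl
# Stand-in for the info array of a failed job: outputDir was never set
array set job1 {status error error "query failed"}

if {[info exists job1(outputDir)]} {
    puts "Output http directory is $job1(outputDir)"
} else {
    puts "No output directory; status is $job1(status)"
}

# Reading the missing element directly raises a Tcl error, which catch traps
if {[catch {set dir $job1(outputDir)} msg]} {
    puts "Caught: $msg"
}
```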
The complete list of array elements is shown below. Element names are case-sensitive. Note that not all elements will be set if a job fails (or if the job was submitted with the -email option), and attempting to read an array element which has not been set will cause a Tcl error condition.
| Array element | Description |
|---|---|
| jobtag | The user-assigned job tag associated with this job |
| cwd | The current working directory at the time the job was submitted |
| unixHost | The hostname of the computer which submitted the job |
| unixUser | The unix username which submitted the job |
| startTime | A date/time string indicating when the job was submitted |
| startTimeS | The unix system clock value (seconds since 1970) when the job was submitted |
| command | The LDAS user command sent to the manager (after comments have been removed and substitutions have been performed) |
| user | The LDAS username used to submit the job to LDAS |
| email | If LJrun was called with the -email option, then this contains the actual email address sent to LDAS. Otherwise it is not set. |
| manager | The address of the LDAS managerAPI to which the job was sent, in the form "<host>:<port>", e.g. "ldas.ligo-wa.caltech.edu:10001" |
| managerIP | Just the Internet address of the LDAS managerAPI to which the job was sent, e.g. "ldas.ligo-wa.caltech.edu" |
| managerPort | The LDAS managerAPI port number to which the job was sent |
| ljproxy | If an LDAS job proxy server was used as an intermediary when submitting the job, then this array element will contain the address of that proxy server in the form "<host>:<port>". If a proxy server was not used, then this array element will not be set. |
| inputs | A list of input files transmitted to LDAS as part of job execution. If there were no inputs, then this will be an empty list. |
| jobInfo | The full text of the message from LDAS stating that the job is running and giving the job ID |
| jobid | The LDAS job ID, e.g. "NORMAL1234" |
| jobnum | The numeric part of the job ID, e.g. "1234" |
| LDASVersion | The version number of the LDAS software running on the LDAS system to which the job was submitted |
| status | The status of the job, which can be "submitted", "running", "done", or "error". |
| done | Equal to 1 if the job has finished (either successfully or with an error), 0 otherwise. However, if LJrun was called with the -email option, then this element will be set to 1 as soon as the job is submitted and LJrun returns. |
| error | An error message if the job failed, or an empty string if the job succeeded |
| jobReply | The full text of the message sent by LDAS when the job finished |
| outputs | A list of URLs from which the outputs from the job can be retrieved. If the job produced no outputs, then this will be an empty list. |
| outputDir | The http directory in which the outputs are located. If the job produced no outputs, then this array element will not be set. |
| jobTime | The total execution time of the job in seconds, as reported by LDAS |
| endTime | A date/time string indicating when the job finished |
| endTimeS | The unix system clock value (seconds since 1970) when the job finished |
| metadataTime etc. | The amount of time the job spent in the metadataAPI, in seconds. Similar array elements may include frameTime, ligolwTime, datacondTime, mpiTime. Times will be reported only for those APIs which were involved in the execution of the job. |
| errorAPI | If the job ended with an error, this indicates which LDAS API flagged the error, e.g. "metadata" for the metadata API. If the job did not end with an error, then this array element will not be set. |
Reading any info array element (except the jobtag element) causes the global LJerror variable to be set to 0 if the job succeeded (or at least has not failed so far), or to 1 if the job has failed.
An alternative way to check the status of a job is to use the LJstatus command, which takes a job tag as its argument. [LJstatus job1] is equivalent to $job1(status).
Technical note: in version 1.0 of the ldasjob package, array elements were copied from the helper process only when they were read by the client script, so that executing "array get ..." on a job info array would return only those elements which had already been read. This behavior was changed in version 2.0 of the ldasjob package; now, the job info array is filled as completely as possible after each call to LJrun, LJwait or LJfill. The LJfill command should not generally be needed now, although it could be useful in certain cases. (For example, passing a job tag to LJfill causes LJerror to be set to indicate whether the job succeeded or failed; this could be handy in a script which has to keep track of multiple jobs at the same time.)
LDAS always reports the location of job outputs as URLs with Internet IP addresses. If the client program actually connected to LDAS via a private network, then it may not be able to connect to the web server at the Internet IP address. In this situation, the ldasjob code modifies the URLs in the outputs and outputDir array elements, replacing the Internet IP address with the private-network address of the manager. This relies on the assumption that the web server is running on the same machine as the manager API.
...
# This assumes that the job is known to have produced exactly one output file,
# so that the "outputs" list contains just one item
set url $job1(outputs)
set contents [LJread $url]
puts "Output file size is [string length $contents] bytes"
...
...
puts "Job produced [llength $job1(outputs)] output file(s)"
#-- Copy all of the output files to local disk
set destDir "/home/pshawhan/outputs"
foreach url $job1(outputs) {
set locFile [LJcopy $url $destDir]
puts "Created file $locFile"
}
...
The example above will terminate the script if an error occurs while
retrieving a file. The code inside the foreach loop may be modified
as follows to handle errors more gracefully:
...
puts "Job produced [llength $job1(outputs)] output files"
#-- Copy all of the output files to local disk
set destDir "/home/pshawhan/outputs"
foreach url $job1(outputs) {
if [catch {LJcopy $url $destDir} locFile] {
#-- An error occurred, so "locFile" now contains the error message
set errorMessage $locFile
puts "Error while copying $url: $errorMessage"
} else {
puts "Created file $locFile"
}
}
...
...
puts "Job's output directory is $job1(outputDir)"
foreach item [LJreaddir $job1(outputDir)] {
puts "Found item $item"
puts "Complete URL for item is $job1(outputDir)/$item"
}
Same thing, but with error handling:
...
puts "Job's output directory is $job1(outputDir)"
if [catch {LJreaddir $job1(outputDir)} itemlist] {
puts "Output directory $job1(outputDir) does not actually exist!"
exit 1
}
foreach item $itemlist {
puts "Found item $item"
puts "Complete URL for item is $job1(outputDir)/$item"
}
#!/usr/bin/env tclshexe
package require ldasjob
#-- Check whether user specified all needed command-line arguments
if { ${#argv} != 3 } {
puts "Usage: frame2ilwd <frame file> <channel> <output file>"
puts "Example: frame2ilwd H-657968401.F H0:PEM-LVEA_SEISX myout.ilwd"
puts "Note: you must either set LDASMANAGER or use the '-manager' option"
exit 1
}
#-- Run the LDAS job
LJrun job1 {
concatFrameData
-returnprotocol http://daq -outputformat {ilwd ascii}
-framequery { {} {} %FILE($1) {} Adc($2) }
}
if $LJerror { puts "LDAS job error:\n$job1(error)"; exit 3 }
#-- Retrieve the output
set url [lindex $job1(outputs) 0]
set gotfile [LJcopy $url $3]
puts "Retrieved $gotfile"
This feature is provided by the "helper process"
that is started when you call LJrun. It replaces the
%FILE(...) in the user command with an obscure URL,
then acts as a (highly restricted) web server to deliver the file to
LDAS when LDAS requests that URL. You can use %FILE(...) as
many times as you want in any given LDAS job.
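Because each %FILE(...) reference names a local file that must be readable when the job runs, a script may want to verify those files before calling LJrun. The helper below is hypothetical (not part of ldasjob) and assumes that file names contain no closing parenthesis.

```tcl
# Hypothetical helper: list the file names referenced via %FILE(...)
proc referencedFiles {cmd} {
    set files {}
    # regexp -all -inline returns alternating full matches and captured names
    foreach {whole name} [regexp -all -inline {%FILE\(([^)]*)\)} $cmd] {
        lappend files $name
    }
    return $files
}

set cmd {concatFrameData -framequery { {} {} %FILE(H-657968401.F) {} Adc(H0:PEM-LVEA_SEISX) }}
foreach f [referencedFiles $cmd] {
    if {![file readable $f]} {
        puts "Warning: cannot read $f"
    }
}
```

The check should be applied to the command text after any variable substitution, so that names supplied through variables are resolved first.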
set info [exec wc $file]
scan $info %d%d%d lines words chars
puts "File $file contains $lines lines"
exec can execute a pipeline, so that another way to count the number of lines in a file would be:
set lines [exec wc $file | cut -c1-8 ]
puts "File $file contains $lines lines"
In general, it is good practice to use "catch" to handle any error which might occur while executing the external program(s). Here is a more careful version of the example just above:
if [catch {exec wc $file | cut -c1-8 } lines] {
#-- If an error occurs, catch stores the error message (typically whatever was written to stderr) in "lines"
set errmsg $lines
puts "Error occurred while counting lines: $errmsg"
} else {
puts "File $file contains $lines lines"
}
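The fixed-column cut in the pipeline depends on how wc pads its output, which varies between systems. A less fragile variant parses the wc output in Tcl with scan and checks how many fields were converted. The sample string below is a stand-in for real wc output, so that the snippet runs without an external program.

```tcl
# Stand-in for the result of [exec wc $file]
set info "  12  34 567 notes.txt"

# scan returns the number of fields it converted
if {[scan $info "%d %d %d" lines words chars] == 3} {
    puts "File contains $lines lines, $words words, $chars characters"
} else {
    puts "Unexpected wc output: $info"
}
```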
#!/usr/bin/env tclshexe
package require ldasjob
#-- Run a gravitational-wave burst search
LJrun search {
dataPipeline
...
}
if $LJerror {puts "LDAS error from dataPipeline job:\n$search(error)"; exit 3}
#-- Copy LDAS job ID into a scalar variable for convenience
set searchjob $search(jobid)
#-- Do a database query to retrieve all the event candidates from the
#-- search job to a local file. I figured out what SQL query to use by
#-- building a query like this with guild, then basically cutting and pasting.
LJrun getmeta {
getMetaData -returnprotocol http://out.xml -outputformat LIGO_LW
-sqlquery {
SELECT * FROM SNGL_BURST
WHERE ((process_id,creator_db) in
(select distinct process_id,creator_db from process
where (jobid = $searchjob)))
ORDER BY start_time, start_time_ns
}
}
if $LJerror {puts "LDAS error from getMetaData job:\n$getmeta(error)"; exit 3}
#-- Retrieve the output from the getMetaData job to a local file
set file [LJcopy $getmeta(outputs) ${searchjob}_events.xml]
#-- Count the number of events in the file using the 'lwtscan' utility
set report [exec lwtscan $file]
#-- The number of rows appears at the end of the report from lwtscan
regexp {(\d+) rows$} $report match nrows
#-- Print out the number of events found
puts "Job $searchjob found $nrows event candidates"
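If the report does not end in the expected form, regexp returns 0 and nrows is never set, so the final puts would fail. A guarded version is sketched below. The report text is an invented stand-in used only to exercise the pattern; it does not reproduce actual lwtscan output.

```tcl
# Invented stand-in for the text returned by [exec lwtscan $file]
set report "table sngl_burst\n42 rows"

# regexp returns 1 on a match and stores the captured digits in nrows
if {[regexp {(\d+) rows$} $report match nrows]} {
    puts "Found $nrows event candidates"
} else {
    puts "Could not find a row count in the report"
}
```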
Another type of script involves a loop, with basically the same job
being executed each time through the loop, but with slightly different
parameters. In this case it is probably best to "delete" the job
(using the LJdelete function) at
the end of the loop, after which the job tag can be re-used. (The
alternative approach of constructing a distinct job tag each time
through the loop would also work, but the script would gradually
consume more and more memory.) This kind of script might have the
following structure:
#!/usr/bin/env tclshexe
package require ldasjob
#-- Initialize parameters, etc.
set start 693960000
set length 60 ;#-- Length of time to be analyzed by a single job
file delete loop.end ;#-- Delete this file if it exists
#-- Loop until a file called "loop.end" appears in the current directory
#-- (a kludgy but effective way for the user to cause a graceful exit)
while { ! [file exists loop.end] } {
#-- Construct the time range for this loop iteration
set trange "$start-[expr {$start + ($length - 1)}]"
#-- Run the LDAS job to analyze this time range
LJrun loopjob -log {
...
}
#-- Now do some post-analysis of the job (or whatever)
...
#-- Clean up at the end of the loop
LJdelete loopjob ;#-- Forget about this job
incr start $length ;#-- Increment the time range
}
puts "Exited loop because a file called loop.end appeared"
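Since the end time of each range is inclusive, successive iterations of such a loop cover adjacent, non-overlapping intervals. The following stand-alone sketch prints the first three ranges generated by the same arithmetic; the fixed iteration count replaces the loop.end test purely for illustration.

```tcl
set start 693960000
set length 60
set ranges {}
for {set i 0} {$i < 3} {incr i} {
    # Inclusive end time: start + (length - 1)
    lappend ranges "$start-[expr {$start + ($length - 1)}]"
    incr start $length
}
foreach r $ranges { puts $r }
```

Each range ends one second before the next one begins, so no second of data is analyzed twice or skipped.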
You can safely use LJdelete to "delete" a job at any time,
even if you submitted the job with the -email or
-nowait option and it might still be running within LDAS.
LJdelete does not instruct LDAS to cancel a job that
is running or queued; it simply causes the ldasjob library
software to forget about that job.
There are two ways to get the results (success/failure status, list of output files, etc.) from a job that was submitted using the -nowait option.
First, you can call LJwait <job_tag>, which returns when the job finishes either successfully or unsuccessfully. (If the job has already finished, LJwait returns immediately.) If the LDAS job finished successfully, then LJwait sets LJerror=0 and returns the LDAS job ID. If the LDAS job failed, then LJwait sets LJerror=1 and returns the error message. If a software error occurs, then LJwait sets LJerror=1, returns the error message, and generates a Tcl error condition which causes the user script to terminate (unless caught and handled). Here is part of a script which runs jobs simultaneously at LHO and LLO, then uses LJwait to wait for them to finish:
#!/usr/bin/env tclshexe
package require ldasjob
set query {
select * from sngl_inspiral
where end_time between 693960000 and 693965000
order by end_time, end_time_ns
}
#-- Submit jobs to run simultaneously, using the "-nowait" option
LJrun lhojob -manager lho -log -nowait {
getMetaData -returnprotocol http://out.xml -outputformat LIGO_LW
-sqlquery $query
}
LJrun llojob -manager llo -log -nowait {
getMetaData -returnprotocol http://out.xml -outputformat LIGO_LW
-sqlquery $query
}
#-- Wait for both jobs to finish
LJwait lhojob
if $LJerror { puts "LHO job failed: $lhojob(error)"; exit 3 }
LJwait llojob
if $LJerror { puts "LLO job failed: $llojob(error)"; exit 3 }
#-- Now retrieve the output files from each job and compare them
...
Note that the -log option still operates normally (printing
the default log message, in this case, as soon as the job is
successfully submitted) when -nowait is used.
The second way to get the results from a job that was submitted using the -nowait option is rather magical: simply reference the desired element of the job info array, and the software will automatically wait until that element has been assigned a value. The calls to LJwait in the example above can be removed to take advantage of this feature:
...
#-- Submit jobs to run simultaneously, using the "-nowait" option
LJrun lhojob -manager lho -log -nowait {
getMetaData -returnprotocol http://out.xml -outputformat LIGO_LW
-sqlquery $query
}
LJrun llojob -manager llo -log -nowait {
getMetaData -returnprotocol http://out.xml -outputformat LIGO_LW
-sqlquery $query
}
#-- Now retrieve the output files from each job and compare them
#-- (These calls will automatically wait until the outputs are known)
set lhofile [LJcopy $lhojob(outputs) lho_out.xml]
set llofile [LJcopy $llojob(outputs) llo_out.xml]
...
Note that reading any job info array element (except the jobtag
element) causes the global LJerror variable to be set to
0 or to 1, depending on whether that job succeeded
or failed. It is also possible to encounter a Tcl error condition, if
you attempt to read an array element that ends up never being set
because the job fails. For this reason, it is probably best to
not use the magical wait-until-filled feature, but instead to
use LJwait to explicitly wait for jobs to finish, and then to
check the value of LJerror before attempting to do anything
with the outputs from the jobs.
A further note: at present, there is no way to wait for any one out of a set of jobs to finish. This feature could be added if there is a need for it.
Caveat: the LJsave and LJrestore commands have not really been tested.
(By the way, ".tk" is the country extension for Tokelau, a small island nation in the Pacific. Check out the remarkably slick www.dot.tk web site.)
The tconvert package allows you to convert (both ways) between GPS seconds and UTC (or local) date/time strings within a Tcl script. It is part of the dataflow LIGOtools package. For more information, see the FAQ entitled "How can I convert between GPS time and UTC or local time?".
To satisfy these two distinct requirements, the ldasjob package is divided into two parts: the ldasjob.tcl library, which contains ordinary procedural code, and a separate event-driven program called LDASJobH which serves as a "helper process", taking care of the asynchronous communication with LDAS. An LDASJobH child process is automatically launched when LJrun is called for the first time in a user script, and this process handles the communication for all calls to LJrun and other ldasjob library commands, no matter how many jobs are submitted by the user script. When the user script terminates for any reason, its associated LDASJobH process terminates too. Any number of user scripts, each with its own LDASJobH process, can run simultaneously on a machine without interfering with each other.
LDASJobH has a special feature to handle the constraint imposed by LDAS on the minimum time between job submissions: when it detects this error, it automatically resubmits the job after an appropriate time interval has elapsed. This is completely transparent from the user's point of view.
For completeness, the ldasjob library includes an LJend function which instructs the LDASJobH process to immediately shut down gracefully. However, normally this is not needed, since the LDASJobH process shuts down gracefully anyway when the user script exits.