In this chapter we go deeper into workflow semantics and work through more complex workflow examples. We also show the syntactic and semantic differences between scripting languages such as Groovy and JRuby.

1 Variables


Job Variables


Job variables are workflow variables which can be defined either statically at the workflow level or dynamically inside a task.
They are similar to a dictionary (HashMap).
When they are defined statically, they are accessible in any task.
When they are defined dynamically, they are only accessible in child tasks.
Statically defined job variables can also be modified dynamically inside a task. In that case, the modification is visible in child tasks only.
Some variables, such as PA_JOB_ID, PA_TASK_ID and PA_USER, are set automatically by the system.
The following diagram illustrates this behavior:


The complete list of system variables can be found in the ProActive documentation: https://doc.activeeon.com/latest/user/ProActiveUserGuide.html#_variables_quick_reference

Let’s show how variables work with an example. Create a new job and call it Job variables.
Click on the Variables tab in the left panel, then on the Add button.
Enter Name: FOO and Value: BAR, then click on OK.
Create a Groovy Task and replace the Task Implementation (i.e. the task script) with the following code:

    println "Task1:"
    println "FOO = " + variables.get("FOO")
    variables.put("FOO2","BAR2")
    variables.put("FOO","NOTBAR")

Inside the task script, the job variables are accessible as a HashMap called variables (its exact type is language-dependent).
Create a second Groovy task and connect it to the first task as a child.
Replace the child task script with the following:

    println "Task2:"
    println "FOO = " + variables.get("FOO")
    println "FOO2 = " + variables.get("FOO2")
    println "PA_USER = " + variables.get("PA_USER")

Execute the job. The following dialog will pop up:

This lets the user change statically defined variables on the fly at submission. Note that the dynamically defined variable FOO2 is not in the list. Click on Execute to keep the original value for now.
Observe that the job output displays the expected behavior:

    [1t0@demo-nodes.activeeon.com;14:24:51] Task1:
    [1t0@demo-nodes.activeeon.com;14:24:51] FOO = BAR
    [1t1@demo-nodes.activeeon.com;14:24:57] Task2:
    [1t1@demo-nodes.activeeon.com;14:24:57] FOO = NOTBAR
    [1t1@demo-nodes.activeeon.com;14:24:57] FOO2 = BAR2
    [1t1@demo-nodes.activeeon.com;14:24:57] PA_USER = user
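The propagation rule can be mimicked outside ProActive with plain maps. The sketch below is a simulation under stated assumptions, not scheduler code: runTask is a hypothetical helper which hands each task its own copy of the variables it inherits and returns that copy, as modified by the task, for its children.

```groovy
// Simulation only: plain maps stand in for the ProActive `variables` binding.
// `runTask` is a hypothetical helper, not part of the ProActive API.
def runTask(Map inherited, Closure script) {
    def variables = new LinkedHashMap(inherited)  // each task works on its own copy
    script(variables)
    return variables                               // what the children of the task receive
}

def jobVariables = [FOO: "BAR"]                    // statically defined at the job level

def afterTask1 = runTask(jobVariables) { variables ->
    variables.put("FOO2", "BAR2")                  // dynamic definition
    variables.put("FOO", "NOTBAR")                 // dynamic modification
}

def seenByTask2 = runTask(afterTask1) { variables -> }

println "job level:  " + jobVariables
println "Task2 sees: " + seenByTask2
```

The job-level map keeps FOO = BAR, while the simulated child task sees FOO = NOTBAR and FOO2 = BAR2, matching the job output above.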
In Bash scripts, job variables can be accessed with the following syntax: $variables_FOO. Variables in Bash scripts are effectively read-only: the script can modify them locally, but the modification is never propagated to child tasks.

Let’s add a third task to our Job variables workflow. Choose a Bash task.
Connect this task as a child of the last Groovy task.
Replace the script content with the following:

    echo "Task3:"
    echo "FOO = " $variables_FOO
    echo "FOO2 = " $variables_FOO2
    echo "PA_USER = " $variables_PA_USER

Execute the job. The following output is displayed:

    [2t0@demo.activeeon.com;14:30:03] Task1:
    [2t0@demo.activeeon.com;14:30:03] FOO = BAR
    [2t1@demo-nodes.activeeon.com;14:30:09] Task2:
    [2t1@demo-nodes.activeeon.com;14:30:09] FOO = NOTBAR
    [2t1@demo-nodes.activeeon.com;14:30:09] FOO2 = BAR2
    [2t1@demo-nodes.activeeon.com;14:30:09] PA_USER = user
    [2t2@demo-nodes.activeeon.com;14:30:13] Task3:
    [2t2@demo-nodes.activeeon.com;14:30:13] FOO = NOTBAR
    [2t2@demo-nodes.activeeon.com;14:30:13] FOO2 = BAR2
    [2t2@demo-nodes.activeeon.com;14:30:13] PA_USER = user

You can see that the variable values are the same as in the parent task. Again, any modification inside the Bash script will not be propagated further.

Static job variables can also be used directly in the XML workflow. The syntax used to access static job variables is ${variable_name}. In the studio, you can also use this syntax inside a parameter.

For example, click on the first task, then open the General Parameters panel. In the description field, enter: ${FOO}
Do the same for the two other tasks and execute the workflow.
In the scheduler interface, click on the job, then on the first task in the task list, and then open the Task Info panel. You can see how the variable has been replaced. If you click on the other two tasks, you will see the same value. At this level, the variable replacement is performed statically, i.e. when the workflow is parsed.


Variables Model


When used inside tasks, job variables are of type String.
It is possible to control the syntax of a variable through the Model attribute.
When the Model attribute is used, the variable is still of type String, but the value given in its definition is validated against the model.
This ensures that a user cannot enter a wrong value which would make the workflow fail.

For example, create a new workflow with a Groovy task and define the following job variable (deselect the task first):

    name=NUMBER value=0 model=PA:INTEGER

Set the following code as the Groovy task implementation:

    println Integer.parseInt(variables.get("NUMBER")) + 3
    println variables.get("NUMBER") + 3

Submit the workflow and try to enter wrong values, for example: 1.4, TOTO, etc.
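The two println statements behave differently because the variable is a String. The following standalone sketch shows the difference; it assumes a plain map as a stand-in for the ProActive variables binding, and it also shows what happens to a script when no model protects it from a value such as TOTO.

```groovy
// Stand-in for the ProActive `variables` binding (assumption: a plain map).
def variables = [NUMBER: "0"]

def parsed = Integer.parseInt(variables.get("NUMBER")) + 3   // integer addition
def concatenated = variables.get("NUMBER") + 3                // string concatenation

println parsed
println concatenated

// Without a model, a value such as "TOTO" reaches the script,
// where the parsing fails at run time.
def failed = false
try {
    Integer.parseInt("TOTO")
} catch (NumberFormatException e) {
    failed = true
}
println "parsing TOTO failed: " + failed
```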


Task Variables


Task Variables are similar to Job Variables, with the following differences:

  • They are defined at the Task level (Select the Task > General Parameters > Task Variables).
  • They can be used only inside task scripts and some task parameters; they cannot be used directly in the Task XML definition.
  • They are not propagated to child tasks.
  • They have a specific extra attribute, inherited, which defines their behavior relative to job variables or propagated variables with the same name.
  • They cannot be redefined when submitting the workflow, but they can use redefined job variables through the ${} syntax or the inherited attribute.


Result Variables


Another way to transfer information between tasks is to use the task result variables.
Inside each task, it is possible to set a result by assigning a value to a variable called result.
A direct child task can access this result through another variable called results (with an “s”, since a task can have multiple parents).
The exact type of the results variable is language-dependent, but it is always an aggregate type such as an array or a list, as it aggregates the results of several parent tasks.

Let’s illustrate this by an example. Create a new job and call it Job results. Create two Groovy Tasks, position them on the same line, and replace the scripts by the following:

  • Left task : result = "a"
  • Right task: result = "b"
Create a third task and position it below the two other tasks. Connect them according to this screenshot:
Replace the child task script with the following:

    println "results[0] = " + results[0]
    println "results[1] = " + results[1]

As you can see, the results variable is a list type in Groovy. Execute the job and observe what is printed in the job output:

    [4t2@demo.activeeon.com;15:04:32] results[0] = b
    [4t2@demo.activeeon.com;15:04:32] results[1] = a

What we can observe here is that each index of the results list holds the result of one parent task: in this run, index 0 contains the result "b" and index 1 the result "a". If you execute the job multiple times, the order will not change and does not depend on which task is executed first. The dependency definition order is always used.

Let’s now have a look at the type contained in the results list. Replace the child task script with the following:

    println "results[0] = " + results[0].getClass()
    println "results[1] = " + results[1].getClass()

Execute the job and observe the output:

    [6t2@demo.activeeon.com;15:08:38] results[0] = class org.ow2.proactive.scheduler.task.TaskResultImpl
    [6t2@demo.activeeon.com;15:08:38] results[1] = class org.ow2.proactive.scheduler.task.TaskResultImpl

This is probably not what you expected, as we put strings in the result variable of the parent tasks. The results list is actually populated with TaskResultImpl Java objects, which have the following API: https://doc.activeeon.com/javadoc/latest/org/ow2/proactive/scheduler/common/task/TaskResult.html
In order to access the real result object, you need to call the value() method. Replace the child task script again with the following:

    println "results[0] = " + results[0].value().getClass()
    println "results[1] = " + results[1].value().getClass()

Execute the job and observe the output:

    [7t2@demo.activeeon.com;15:16:05] results[0] = class java.lang.String
    [7t2@demo.activeeon.com;15:16:05] results[1] = class java.lang.String

Here we have the expected type, String.
Important: In the R language, it is not necessary to call the value() method on the elements of the results object, as the R script engine does that automatically.
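The wrapper-versus-value distinction can be tried outside ProActive with a small stand-in. In the sketch below, FakeTaskResult is a hypothetical class written for illustration only; it is not the real TaskResultImpl and mimics nothing but its value() method.

```groovy
// Hypothetical stand-in for a task result wrapper. The real class is
// org.ow2.proactive.scheduler.task.TaskResultImpl and is not used here.
class FakeTaskResult {
    private final Object wrapped
    FakeTaskResult(Object wrapped) { this.wrapped = wrapped }
    Object value() { wrapped }
}

// Stand-in for the `results` binding of a task with two parents.
def results = [new FakeTaskResult("a"), new FakeTaskResult("b")]

println results[0].getClass().simpleName       // the wrapper, not the String
println results[0].value().getClass().name     // the wrapped object
def values = results.collect { it.value() }    // unwrap every parent result
println values
```

Unwrapping all parent results with collect is a convenient first step whenever a child task only needs the values.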
The ProActive documentation contains detailed information about variable usage in different contexts.


We are now going to use what we’ve learnt about variables and results to fix the Replicate job used previously. This job was failing because we increased the number of replications without adapting the semantics of the tasks.

Open the Replicate job again and open the Split task’s script content:

    result = [0:"abc", 1:"def"]

The result variable is used here to return a map of values, indexed by integers (0, 1, etc). This map contains only two values, and yet we want ten values to be used in ten different replicated tasks. So let’s replace the Split task’s script with the following:

    result = [0:"a", 1:"b", 2:"c", 3:"d", 4:"e", 5:"f", 6:"g", 7:"h", 8:"i", 9:"j"]

There is no need to update the replicate script, which sets the replication level dynamically:

    runs=result.size()

Let’s now observe the Process task:

    int replication = variables.get('PA_TASK_REPLICATION')
    input = results[0].value()[replication]
    result = input.toUpperCase()

The first line uses an internal variable called PA_TASK_REPLICATION, which contains the replication index starting from 0. Depending on which replicated task is executed, it contains the value 0, 1, 2, etc. Let’s decompose the second line:
  • results[0] contains the result of the split task. All Process tasks have a single parent, so the results list contains only one TaskResultImpl object.
  • results[0].value() returns the map as defined in the split task.
  • results[0].value()[replication] returns the element in this map which this Process task must handle.
The last line simply converts this element to upper case. Let’s now observe the Merge task:

    println results

This simply prints the content of the received results variable. For now, let’s execute the job and observe its output:

    [8t2@demo-nodes.activeeon.com;15:45:11] [A, B, C, D, E, F, G, H, I, J]

We see that the Merge task displayed a list containing all converted string values, in the correct order. But now, let’s say we want to print a single string containing “ABCDEFGHIJ”. Before we modify the Merge task accordingly, we need to notice that, unlike in the Process task, the results list will not contain a single value, but ten values (the Merge task has ten parents). We modify the Merge task by writing a simple loop which iterates over the results list:

    answer = ""
    for (i = 0; i < results.size() ; i++) {
        answer += results[i].value()
    }
    println answer

Notice that it’s necessary to call the value() method to allow string concatenation (otherwise, an error will occur). Let’s execute the job and observe the output:

    [9t2@demo.activeeon.com;15:52:01] ABCDEFGHIJ
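The whole Split / Process / Merge chain can be replayed in a single standalone script. The sketch below is a simulation: plain Groovy collections replace the result, results and variables bindings, and the names splitResult and processResults are introduced here for illustration only.

```groovy
// Simulation only: plain Groovy collections replace the ProActive bindings.
def splitResult = [0:"a", 1:"b", 2:"c", 3:"d", 4:"e", 5:"f", 6:"g", 7:"h", 8:"i", 9:"j"]

// Replicate script: one Process task per map entry.
def runs = splitResult.size()

// Process tasks: each one handles the entry matching its replication index.
def processResults = (0..<runs).collect { replication ->
    splitResult[replication].toUpperCase()
}

// Merge task: concatenate the parent results in order.
def answer = processResults.join("")

println runs
println answer
```

Inside a real Merge task, the same idea would read results.collect { it.value() }.join(""), assuming the results binding behaves like a Groovy list, as the loop above suggests.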
Do not forget to consider language specificities. In R, array indexes start at 1, so the Process task would be:

    replication <- variables[['PA_TASK_REPLICATION']]
    input <- results[[1]][[replication+1]]
    result <- toupper(input)

In Ruby, the Ruby array created in the Split task must first be converted to Java to prevent an error:

    $result = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"].to_java


2 Resource selection


Resource selection lets you choose specific ProActive nodes to execute tasks. It is useful when the resource manager controls different machine groups, with different libraries installed, or even different operating systems. It is especially useful when heterogeneous machines are connected to the scheduler. Selection is done by writing selection scripts which determine whether a task can be executed on a given ProActive Node or not.

Let’s show with an example how we can select a specific machine for execution. Create a new job in the studio named Selection job with a single Groovy task.
Open the Node Selection panel and click on Add. This will open the following dialog:

  • Enter language : groovy
  • Enter Type : static
  • Enter script:

    hostname = InetAddress.getLocalHost().getHostName()
    println "Executed on " + hostname
    selected = ("trydev" == hostname)

Let’s decompose this code:
  • The first line is a Java/Groovy statement which retrieves the hostname.
  • The second line simply prints this value.
  • The last line assigns a boolean value to the selected variable: true if the hostname is “trydev” and false otherwise.
The selected variable is to a selection script what the result variable is to a Task: it is the result of the selection. The selection script is executed on ProActive Nodes until enough nodes have been found to execute the task(s). Selection scripts can be inlined or referenced using a catalog URL. Selection scripts can be static or dynamic:
  • Static scripts are executed only once on any given Node. The result of the script is remembered by the scheduler in its cache. If another task uses the same script, the result stored in the cache is used as the node answer and the script is not executed again on the same node. Static scripts are useful to check a fact that is not going to change over time (like the hostname, the operating system, etc).
  • Dynamic scripts, on the contrary, are always evaluated, and are useful to check volatile facts like CPU usage, available RAM, etc.

You can parametrize your selection script by setting the Arguments field and retrieving its value in the selection script code through the variable args.
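As a minimal sketch of such a parametrized script, the hostname comparison can be moved into a helper which takes the arguments explicitly. The helper isSelected is hypothetical and exists only so that the logic can be tried outside ProActive; it assumes that args can be indexed like a list of strings.

```groovy
// Sketch: hypothetical helper holding the selection logic so that it can be
// tried outside ProActive. `args` mirrors the selection script Arguments field.
def isSelected(List<String> args, String hostname) {
    return args[0] == hostname
}

// Inside a real selection script, the last line would be along the lines of:
//   selected = (args[0] == hostname)
println isSelected(["trydev"], "trydev")
println isSelected(["trydev"], "xstoocky14")
```

Passing the expected hostname as an argument lets the same script be reused for several machines instead of hard-coding "trydev".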
Submit the workflow now and observe that it’s executed on trydev.
Open the job output, and look for the message “Executed on ...”: it does not appear.
The selection script output is not printed in the task logs. In order to find it, you need to open the Server Logs.
You can access the server logs by opening the Server Logs tab on the scheduler and clicking on Fetch Output.
Even if the server logs are debug logs which are difficult to understand, you will see the desired output in the middle, such as:

    ================ Task 410t0 logs =================
    [2016-06-24 20:33:01,047 INFO o.o.p.r.s.ScriptExecutor] pnp://138.102.172.123:39136/SSH-xstoocky-25_12 : -1902517254 output Executed on xstoocky14
    [2016-06-24 20:33:01,047 INFO o.o.p.r.s.ScriptExecutor] pnp://138.102.172.123:39136/SSH-xstoocky-25_10 : -1902517254 result false
    [2016-06-24 20:33:01,047 INFO o.o.p.r.s.ScriptExecutor] pnp://138.102.172.123:39136/SSH-xstoocky-25_10 : -1902517254 output Executed on xstoocky14


3 Data management


When we create a file in a task, the file is located in the working directory of the task. In ProActive terminology, this directory is called the Local Space. It is volatile and is deleted after the task is finished, so any output file that must be kept has to be transferred elsewhere.

To illustrate this, let’s create a new job called LocalSpace job with a single Windows cmd task.
Replace the script content with the following and execute the job:

    echo %cd%

In the job output, you will see something like:

    [419t0@xstoocky11.mgps.inra.fr;23:05:23] c:\tmp\PA_JVM1915370136\419\-250683912

If you execute the job multiple times, the directories in this path will change each time. The Local Space is one of the six available ProActive DataSpaces for a job:
  • LocalSpace : temporary space created on a ProActive node when a task is executed.
  • GlobalSpace : a shared space stored on ProActive server machine, accessible to all users.
  • UserSpace : a space stored on ProActive server machine accessible only to the current user.
  • InputSpace : a private space manually started by a user (read-only).
  • OutputSpace : similar to the InputSpace but in read/write mode.
  • CacheSpace : a unique and shared space for all tasks and all ProActive Nodes deployed on the host.

Let’s create a new job and call it Userspace job.
Create two dependent Groovy tasks.
In the parent task, replace the script content with the following code, which creates a new file containing “Hello World”:

    new File("a_file.txt") << "Hello World"

In the child task, replace the script content with the following code, which prints the content of the file a_file.txt:

    println new File("a_file.txt").text

You can execute this job now and observe what happens.
The second task is faulty and the following error appears in the task logs:

    [14t1@demo-nodes.activeeon.com;16:47:55] Failed to execute task: javax.script.ScriptException: java.io.FileNotFoundException: a_file.txt (No such file or directory)

This is because the a_file.txt file created in the first task was created inside the LocalSpace, which is deleted at the end of the task execution.
As the second task needs this file, we must find a way to transfer it.
In order to do that, we are going to transfer the file to the UserSpace at the end of the parent task execution, and then transfer it from the UserSpace to the second task.
Click on the first task and open the Data Management panel.
Under Output Files, click on the Add button. This will open the following dialog:

This dialog lets you include files in, or exclude files from, the transfer.
Enter a_file.txt in the Includes field and choose transferToUserSpace in the Access Mode list. (Include and exclude patterns also support wildcards such as **/*.txt.)
Similarly, in the second task, open the Data Management panel and click on the Add button under the Input Files category.
Enter a_file.txt in the Includes field and choose transferFromUserSpace in the Access Mode list.
Execute the job and verify that you see in the job output:

    [15t1@demo.activeeon.com;16:56:47] Hello World

ProActive DataSpaces are a convenient way of transferring files between tasks.
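The effect of the two transfers can be reproduced on a single machine. The sketch below is only a simulation: temporary directories stand in for the UserSpace and for the LocalSpace of each task, and the copies performed by the scheduler are written out by hand with java.nio.file.Files.

```groovy
import java.nio.file.Files

// Simulation only: temporary directories stand in for the ProActive spaces.
def userSpace   = Files.createTempDirectory("userspace").toFile()
def localSpace1 = Files.createTempDirectory("task1").toFile()
def localSpace2 = Files.createTempDirectory("task2").toFile()

// Parent task: create the file in its own Local Space.
new File(localSpace1, "a_file.txt") << "Hello World"

// Output Files, transferToUserSpace: copy the file, then the Local Space is deleted.
Files.copy(new File(localSpace1, "a_file.txt").toPath(), new File(userSpace, "a_file.txt").toPath())
localSpace1.deleteDir()

// Input Files, transferFromUserSpace: copy into the Local Space of the child task.
Files.copy(new File(userSpace, "a_file.txt").toPath(), new File(localSpace2, "a_file.txt").toPath())

// Child task: read the file from its own Local Space.
def content = new File(localSpace2, "a_file.txt").text
println content

// Clean up the simulated spaces.
localSpace2.deleteDir()
userSpace.deleteDir()
```

Removing either copy from the sketch reproduces the FileNotFoundException situation of the first attempt: the child task only ever sees its own directory.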


4 Control structures


As we already saw with the replicate example, control structures make it possible to build dynamic workflows with control flow decisions.
There are three kinds of control structures:

  • Replication : The replication allows the execution of multiple tasks in parallel when only one task is defined and the number of tasks to run could change.
  • Branch : The branch construct provides the ability to choose between two alternative task flows, with the possibility to merge back to a common flow.
  • Loop : The loop provides the ability to repeat a set of tasks.

All control structures are defined directly in the XML format of the workflow. The studio lets you drag and drop predefined structures, but it is sometimes necessary to edit the XML workflow directly to write complex scenarios. The chapter Workflow Concepts in the ProActive documentation covers the necessary knowledge for editing XML workflows: https://doc.activeeon.com/latest/user/ProActiveUserGuide.html#_workflow_concepts


Each control structure relies on the execution of a Control Flow script. The execution of this script allows the structure to be defined dynamically rather than statically. For example, in the branch structure, the control flow script dynamically chooses whether the flow should go through the IF or through the ELSE block.
Thus, subsequent runs of the same workflow can go through different execution flows, depending on the context. The control flow script is executed immediately after a task script. It has access to the result variable of the task script.

For each control structure, the control flow script must return a specific variable which determines the behavior:
  • Replication : the replication flow script must set the variable runs, which contains the number of times the child task will be executed in parallel. Example:

    runs = result % 4 + 1

  • Branch : the branch flow script must set the variable branch, which can contain either the string value “if” or “else”. Example:

    if (result % 2) { branch = "if" } else { branch = "else" }

  • Loop : the loop flow script must set a boolean value in the loop variable. If the loop variable is true, the flow iterates one more time. If it is false, the iteration stops. Example:

    if (result == 4) { loop = false } else { loop = true }
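To see what these three example scripts decide for a given task result, they can be wrapped in small functions and evaluated outside the scheduler. The function names below are hypothetical and introduced for this sketch only; in a real flow script, runs, branch and loop are simply assigned.

```groovy
// Sketch: the three example flow scripts wrapped in hypothetical helper
// functions so that they can be evaluated for a given task result.
def replicateScript(int result) {
    def runs = result % 4 + 1
    return runs
}

def branchScript(int result) {
    def branch
    if (result % 2) {          // Groovy truth: a non-zero number is true
        branch = "if"
    } else {
        branch = "else"
    }
    return branch
}

def loopScript(int result) {
    def loop = (result != 4)   // equivalent to the if/else form of the example
    return loop
}

[4, 7].each { result ->
    println "result=${result} runs=${replicateScript(result)} " +
            "branch=${branchScript(result)} loop=${loopScript(result)}"
}
```

Note that the branch example relies on Groovy truth, so an odd result selects the IF block and an even result the ELSE block.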

In addition to control flow scripts, loops and replications use specific variables called iteration and replication indexes. These indexes identify a task which is executed inside a structure. For example, the first time a loop is executed the iteration index is 0, the second time 1, etc. These indexes can be used inside scripts as internally defined variables:
  • Iteration index : variables.get('PA_TASK_ITERATION')
  • Replication index : variables.get('PA_TASK_REPLICATION')
These indexes are also used to name generated tasks. If you remember the replication example, the replicated Process tasks were named Process, Process*1, Process*2, etc.
Similarly, tasks inside a loop are named Process, Process#1, Process#2, etc.

Let’s try to write a loop workflow. Create a new job and call it Loop job.
Create two dependent Groovy tasks, then click on the child task and open the Control Flow panel.
In the top list, choose loop. A new green icon will appear on the task.
Click on this icon and drag the arrow to the corresponding icon on the parent task.
Now click on the loop box, and add this Groovy script to loop until the iteration index reaches 5:

    if (variables.get('PA_TASK_ITERATION') < 5) {
        loop = true;
    } else {
        loop = false;
    }

At this point, if you check the workflow, an error message should appear on the screen:
This is because our loop lacks a block structure, which groups tasks together semantically.

Click on the parent task and open the Control Flow panel. In the block list, choose start block.
Similarly, click on the child task, and choose end block in the same list.
Now click on the Check button in the top menu. You should see “Workflow is Valid”.
Now that we have a valid loop syntax, we must work a bit more to make the loop actually do something.
Edit the parent task script with the following:

    println "Loop block start" + variables.get('PA_TASK_ITERATION')

Edit the child task script with:

    println "Loop block end" + variables.get('PA_TASK_ITERATION')

Execute the workflow, and observe how the loop is executed by the scheduler and how new tasks are dynamically created.
You should see the following job output at the end:

    [422t0@xstoocky11.mgps.inra.fr;23:50:28] Loop block start0
    [422t1@xstoocky12.mgps.inra.fr;23:50:36] Loop block end0
    [422t2@138.102.172.110;23:49:02] Loop block start1
    [422t3@138.102.172.117;23:49:07] Loop block end1
    [422t4@xstoocky09.mgps.inra.fr;23:50:53] Loop block start2
    [422t5@138.102.172.110;23:49:17] Loop block end2
    [422t6@138.102.172.123;23:49:19] Loop block start3
    [422t7@xstoocky12.mgps.inra.fr;23:51:04] Loop block end3
    [422t8@xstoocky03.mgps.inra.fr;23:51:13] Loop block start4
    [422t9@138.102.172.123;23:49:35] Loop block end4
    [422t10@138.102.172.110;23:49:41] Loop block start5
    [422t11@xstoocky12.mgps.inra.fr;23:51:24] Loop block end5
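The number of iterations can be checked with a standalone replay of the loop. The sketch below is a simulation: an ordinary while loop plays the role of the scheduler, and the local variable iteration stands in for PA_TASK_ITERATION.

```groovy
// Simulation only: a plain loop replays what the scheduler does with the
// loop flow script; `iteration` plays the role of PA_TASK_ITERATION.
def output = []
def iteration = 0
while (true) {
    output << "Loop block start" + iteration    // parent task
    output << "Loop block end" + iteration      // child task
    def loop = iteration < 5                     // loop flow script
    if (!loop) {
        break
    }
    iteration++
}

output.each { println it }
println "iterations executed: " + (iteration + 1)
```

Because the flow script is evaluated after the block has run, the block is executed once more when the index reaches 5, which is why the job output ends with start5 and end5.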
ProActive loop workflows offer another useful feature: CRON expression loops.
CRON loops are defined exactly like standard loops, except that the loop variable receives a CRON expression string instead of a boolean value.
This expression tells the loop at which time it should execute next. The loop is paused until the expression is met.
To illustrate this, edit the loop script again, and replace it with the following code:

    if (variables.get('PA_TASK_ITERATION') < 5) {
        loop = "* * * * *"
    } else {
        loop = false;
    }

This tells the loop to wait until the next minute before each new iteration, until the iteration index reaches 5.
Execute the job and observe how it’s handled by the scheduler.


5 Multi-Node Tasks


We already saw briefly that a ProActive Task can reserve more than one ProActive Node for its execution. The reason behind this feature is that not all tasks simply execute a basic script: tasks often call an external program, and that program can be multi-threaded and use multiple cores on the machine. In that scenario, it’s important to precisely match the number of ProActive Nodes used by our task with the machine resources actually used by the program. Otherwise, the scheduler could dispatch more tasks to the same machine than the machine resources can handle.

Open the Multi node job again, click on the task, and open the Multi-Node Execution panel. Open the Topology list. It offers many choices, but we are going to focus on the two most useful ones:

  • Single Host: this topology setting tells the scheduler that the task will reserve multiple Nodes on the same machine. The number of nodes reserved is determined by the Number of Nodes parameter.
  • Single Host Exclusive: this topology setting is similar to Single Host, but it also tells the scheduler that the whole machine will be reserved to execute the task.
Choose Single Host Exclusive now, and leave the Number of Nodes = 1.
Execute the job, and observe on the Resource Manager interface how the whole machine is reserved.


6 On-Error policies


Throughout this course, we ran many failing jobs, and each time we observed that the scheduler tries to execute a failing task several times, then continues the job execution with other tasks. This is the standard behavior for failing tasks, but each workflow can define its own failover policy.

Let’s open the Result job again, which was producing an error.
Click on the desktop outside the task to open the job parameter panel.
Click on the panel Error Handling.
Here you can see the Maximum Number of Executions Attempts (2 by default), and two other settings:

  • On Task Error Policy : this tells the scheduler how it should react when an error occurs. The scheduler can cancel the whole job once all execution attempts have been made, suspend the job after the first error occurs, etc.
  • If An Error Occurs Restart Task : this tells the scheduler where to restart the task if an error occurs. By default, it is unspecified, but we can ask the scheduler to restart the task on a different ProActive Node than the one already used.
Choose suspend task after first error and pause job immediately and execute the job.

As you can see, after the error occurs, the job is paused in a state called In-Error. This state lets the user investigate what went wrong before resuming the job. In the current ProActive version, the task script cannot be modified, but databases or files used by the task, for example, can be checked and fixed.
In order to resume the execution, right-click on the job and choose, for example, Restart All In-Error tasks. If the task fails again (here it will), the job will be paused again and will need to be resumed, until all execution attempts have been made.


7 Fork Environment


When a ProActive Task is executed on a ProActive Node, a dedicated Java Virtual Machine is started to execute the Task.
The forked JVM parameters are automatically configured by the ProActive Node, but sometimes it may be necessary to provide additional configurations to the JVM. This configuration can be performed thanks to a Fork Environment or Fork Environment Script.

Let’s demonstrate this with an example. Create a Bash Task containing the pwd command to display the current directory.
Execute this task. You should see in the output something like:

    /tmp/PA_JVM1063826057/local-LocalBis-0_1/9t0/1267788085

Now let’s change the fork environment for this task. From the Fork Environment menu, in the working folder field, enter:

    /tmp/${PA_JOB_ID}

Execute the workflow again. You should now see in the output something similar to:

    /tmp/10

A fork environment can be used to define multiple parameters such as the working folder or environment variables. It can also be used to execute a task inside a Docker container.


8 Containers


ProActive natively supports containers (Docker, Kubernetes, etc.). As a first approach, let’s run a simple Bash command from a basic Linux container.
From the Studio, drag and drop a Dockerfile task (Tasks->Containers->Dockerfile).
Open the Task Implementation view to see the script. It simply prints “Hello” from a freshly started Ubuntu container.
Execute the workflow. If the Docker image (here Ubuntu 18.04) is not present locally, you will see image-pull-related logs in the job output.

    INFO - Docker file /tmp/PA_JVM1294821022/local__localhost__0_2/12t0/2018491587/Dockerfile created.
    [12t0@trydev.activeeon.com;15:32:56] Docker file /tmp/PA_JVM1294821022/local__localhost__0_2/12t0/2018491587/Dockerfile created.
    [12t0@trydev.activeeon.com;15:32:56] INFO - Running command: [docker, build, -t, image_12t0, .]
    [12t0@trydev.activeeon.com;15:32:56] Running command: [docker, build, -t, image_12t0, .]
    [12t0@trydev.activeeon.com;15:32:56] Sending build context to Docker daemon 5.632kB
    [12t0@trydev.activeeon.com;15:32:57] Step 1/2 : FROM ubuntu:18.04
    [12t0@trydev.activeeon.com;15:32:58] 18.04: Pulling from library/ubuntu
    [12t0@trydev.activeeon.com;15:32:58] 23884877105a: Pulling fs layer
    [12t0@trydev.activeeon.com;15:32:58] bc38caa0f5b9: Pulling fs layer
    [12t0@trydev.activeeon.com;15:32:58] 2910811b6c42: Pulling fs layer
    [12t0@trydev.activeeon.com;15:32:58] 36505266dcc6: Pulling fs layer
    [12t0@trydev.activeeon.com;15:32:58] 36505266dcc6: Waiting
    [12t0@trydev.activeeon.com;15:32:58] bc38caa0f5b9: Download complete
    [12t0@trydev.activeeon.com;15:32:58] 2910811b6c42: Verifying Checksum
    [12t0@trydev.activeeon.com;15:32:58] 2910811b6c42: Download complete
    [12t0@trydev.activeeon.com;15:32:59] 36505266dcc6: Verifying Checksum
    [12t0@trydev.activeeon.com;15:32:59] 36505266dcc6: Download complete
    [12t0@trydev.activeeon.com;15:32:59] 23884877105a: Verifying Checksum
    [12t0@trydev.activeeon.com;15:32:59] 23884877105a: Download complete
    [12t0@trydev.activeeon.com;15:33:02] 23884877105a: Pull complete
    [12t0@trydev.activeeon.com;15:33:02] bc38caa0f5b9: Pull complete
    [12t0@trydev.activeeon.com;15:33:02] 2910811b6c42: Pull complete
    [12t0@trydev.activeeon.com;15:33:02] 36505266dcc6: Pull complete
    [12t0@trydev.activeeon.com;15:33:02] Digest: sha256:3235326357dfb65f1781dbc4df3b834546d8bf914e82cce58e6e6b676e23ce8f
    [12t0@trydev.activeeon.com;15:33:02] Status: Downloaded newer image for ubuntu:18.04
    [12t0@trydev.activeeon.com;15:33:02] ---> c3c304cb4f22
    [12t0@trydev.activeeon.com;15:33:02] Step 2/2 : RUN echo "Hello"
    [12t0@trydev.activeeon.com;15:33:03] ---> Running in 3df0c7a81da8
    [12t0@trydev.activeeon.com;15:33:04] Hello
Docker is natively supported not only as a ProActive task, but also as the execution environment of any ProActive task.
Suppose you want to run an R script (statistics) and do not want to clutter your system with an R installation. Try this.
Drag and drop an R task (Tasks->Languages->R).
Click on the task, then under Fork Environment->Environment Script, import the fork_env_docker_rbase_datadir script.
Open the script to see its content: this Groovy script builds the docker run command (with the required mounted directories, environment variables, etc.) which wraps the ProActive task script execution.
Now, go to the Task Implementation view and replace the existing R script with this one:

    x <- pi ^ (1:5)
    x
    print(x)

Execute the job. You will see the following result in the Job output view:

    [14t0@trydev.activeeon.com;15:56:28] Operating system : Linux
    [14t0@trydev.activeeon.com;15:56:28] DOCKER COMMAND : [docker, run, --rm, --env, HOME=/tmp, -v, /home/cperUser/opt/proactive/scheduling:/home/cperUser/opt/proactive/scheduling, -v, /tmp/PA_JVM1294821022/local__localhost__0_1/14t0/2018491594:/tmp/PA_JVM1294821022/local__localhost__0_1/14t0/2018491594, -v, /tmp/cache:/tmp/cache, -v, null:/tmp/data, -v, /home/cperUser/opt/proactive/scheduling/jre:/home/cperUser/opt/proactive/scheduling/jre, -w, /tmp/PA_JVM1294821022/local__localhost__0_1/14t0/2018491594, --user=1002:1002, activeeon/r-base:latest]
    [14t0@trydev.activeeon.com;15:56:32] [1] 3.141593 9.869604 31.006277 97.409091 306.019685


9 Generic Information


Generic Information key features:

  • Similar to variables, can be used in a dictionary-like fashion inside ProActive Tasks.
  • A predefined set of generic info is interpreted by the scheduler, the studio or the node, to customize some functionalities.
For example:
  • START_AT : a date which defines when a task or a job is scheduled (interpreted by the Scheduler).
  • PYTHON_COMMAND : the path to the python command to execute in CPython tasks (interpreted by the Node).
  • TASK.ICON : Interpreted by the Studio to display a custom icon for the task.

Let’s show an example of the START_AT generic info. Create a Workflow with a single task and add this generic info at the job level (Generic Info menu):

    Property Name: START_AT
    Property Value: 2022-05-28T15:00:00+02:00

Execute the job. In the scheduler portal, select the job, and open the Job Variables panel:
You can see how the job execution is delayed.
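A START_AT value in the same shape as the example can be produced with java.time. The sketch below only reproduces the format of the example value; the pattern and the ten-minute delay are assumptions made for illustration, and which date formats the scheduler accepts should be checked in the ProActive documentation.

```groovy
import java.time.OffsetDateTime
import java.time.ZoneOffset
import java.time.format.DateTimeFormatter

// Pattern chosen to match the shape of the example value 2022-05-28T15:00:00+02:00.
def formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssxxx")

// Reproduce the example value from the text.
def example = OffsetDateTime.of(2022, 5, 28, 15, 0, 0, 0, ZoneOffset.ofHours(2))
println example.format(formatter)

// A value ten minutes from now, for instance to paste into the START_AT generic info.
def startAt = OffsetDateTime.now().plusMinutes(10).format(formatter)
println startAt
```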


10 Third-Party Credentials


Third-Party credentials key features:

  • Similar to variables, can be used in a dictionary-like fashion inside ProActive Tasks.
  • Encrypted inside the scheduler.
  • Defined for each user, each workflow run by this user can use the same private set of credentials.
  • Can be defined using the Scheduler Portal or the REST API.

Let’s create a credential. From the scheduler portal, open Portal->Manage Third-Party Credentials.
In the bottom form, enter myKey / somevalue and click on Add.
To use these credentials inside a task, create a Groovy Task with the following code:

    println credentials.get("myKey")

Execute the workflow and observe the output:

    [29t0@trydev.activeeon.com;13:30:21] myKey : value