Tasks List

The following is an updated list of Job Tasks for Simflofy. Each has its own section for details on its purpose, how and when to use it, and how to configure it.

ACL Converter

This task is for advanced users who wish to write their own ACL Conversion java code. In order to do this, the class must implement the IACLConverter Interface. The task must be packaged in the simflofy-admin/WEB-INF/lib folder. Here are Oracles docs on how to create a jar file. You will need to first compile the class, then package it. From there, the task will execute the implementation of the method. The Map produced will be added to a specific field on each document called "Transformed Permissions".

The task's one field should be the package + class name for your custom class.


Map<String, Set<String>> perms = this.aclConverter.convertACLs(rd.getACL(), outputHelper.isFolder(rd));
rd.setTransformedPermissions(perms);

Any connector with Process ACLS as an option uses the transformed permissions.

Full IACLConverter Class

package com.simflofy.core.common.output.acl;

import java.util.Map;
import java.util.Set;

/**
 * @author mlugert
 */
public interface IACLConverter {
    /**
     * @param rawAcls
     * @return Takes a list of ACLs in this format:
     * principle=permission1,permission2,permission3
     * <p>
     * and converts them to a map with the key=principle
     * and the set being a list of permissions
     * <p>
     * Each converter will be specific to a customer/repo
     * The permissions will be tranformed from the repo ACLs
     * into the output ACLs that the customer wants.
     */
    Map<String, Set<String>> convertACLs(String[] rawAcls, boolean isFolder) throws Exception;
}

Alfresco Property Mapping Nodes

Purpose: This task is for getting existing node references in an Alfresco instance, in order to update them, rather than create a new one if the existing file has moved from its original ingestion location.

Use cases: It's used in jobs where the Alfresco Output connector is used. Typically, in incremental or sync jobs, where changes to a file from a source system need to make it to the node it previously integrated to.

How to set up: Provide the query JSON object, which is then sent to Alfresco which will return the matching Alfresco nodeReference, which we'll use to update the document.

Exceptions/requirements: If the query is too vague, and matches more than one node/document, then none of the documents will be updated. The query has to uniquely find a single node/document in Alfresco.

Alfresco Job Run History Nodes

Purpose: Getting the Alfresco Node Reference from the Current Job Run History, in order to update an existing document from a previous Job run in Alfresco, rather than creating a new one.

Use Case: It's used in Incremental or Sync jobs where the Alfresco connector is being used. This task uses the Source Repository ID from the Repository the document is pulled from, in order to find the document in the Job Run History.

Set up: There are no configuration options for this task. It uses the Document's Source Repository ID in order to find the node Reference in Alfresco.

Exception/Requirements: Since the Source Repository ID is used, any source repositories that use a Path Based Source ID will not work if the path of the document changes.

Apply Security Marks

Used to add Security Marks to documents during migration. The task uses javascript to push Security Mark Ids (accessible through the API) to an array called results. These results will be added to the secMarks field on the document. You will still need to map this field to apply them a document in the indexing engine.

AWS Rekognition Image

Detects real world objects in images and adds these labels to the repository document on the field simflofy_ai_labels using the AWS Rekognition system.

For your reference, these are 'MinConfidence' and 'MaxLabels' here. This page is what this task uses.

Task Configuration

Minimum Threshold: The minimum confidence threshold for labels to return following label detection. Labels with a confidence level lower than this will not be returned.
Max Number of labels: The maximum number of recognized labels to be returned, by highest confidence.
Auth conn: Your Amazon AWS credentials
Max number of labels: The number of labels you want to return

AWS Textract

Extracts text from PNGs, JPGs, and PDFs and stores it on the repository document in the simflofy_ai_text field.

Task Configuration:

Authentication Connection: Your Amazon AWS credentials

Reference documentation: Amazon Textract

Basic JDBC Job

User Name: JDBC User Name

Password: JDBC Password

Driver Class: The driver for the JDBC database. A number of possible drivers are listed here.

caution

Ensure that your driver jar file is in the simflofy-admin/WEB-INF/lib folder at startup.

JDBC Url: Url to connect to the database. Each type of database uses a different format. Refer to the linked table for formats.

ID Field: The field (without table name) which will be used to name binaries queried from the database.

Query: The query to execute. All results will be added to the document in the format [tableName].[fieldName]

Buffer Binaries to File System

As the name says, this task will buffer files to temporary directory. In this case the value of the System Property java.io.tmpdir. The task takes a timeout which will fail the document if it does not complete staging in the given amount of time. The default is 10 seconds, but for larger files (100+ MB), we recommend 60 seconds.

Clean Array Value To String

Converts a property on a repository document from an array to a String.

Delimiter: The delimiter to place in between values of the Array. If left empty will default to space

Property: The array repository document property to convert into a String. The new String property will have the same name.

CMIS ACL modification

This task is meant to be used with a CMIS Repository connection. It will use the Repository Document id, and gather the current ACL for the document. It will then generate a new ACL based on the parameters. This task establishes a session upon initialization, and keeps it open until the job run is complete.

The principal lists are pipe (|) delimited, to account for LDAP style principals. You will need to know the exact principal ids of the ACEs (Access Control Entries). The task works by cycling through the current ACEs and a series of if-then logic to construct a new ACL.

User name: CMIS server user name
Password: CMIS server password
Connection URL: CMIS server url
Repository ID: CMIS repository ID
Pipe (|) delimited list of principals to ignore from each document: A list of principals to ignore and not add modifications to. The 'ignore' list will bypass any modification to a matchingACE and add it directly back to the new ACL.
Pipe (|) delimited list of principals to remove from each document: A list of principals to be removed from the ACL. The remove list will skip over a matching ACE, leaving it out.
Pipe (|) delimited list of principals to add to each document: A list of principals to be added to the ACL. The add list is checked separately and will generate a new ACE will the selected permissions.
Permission to add to the principals: Not ignored ACEs and added ACEs will have this permission added to them in the ACL.

Convert To UTC

As the task name states, will convert date fields to the given format, with the given offset.

Date Fields to Convert: A comma delimited list of repository fields or the output of calculated field.

If you do not supply field names, the task will attempt to process all the Job's Datetime mappings. Note: only calculated field mappings are available to job processors.

Offset:Timezone offset.

Date Time Format: Final format for the field you're converting.

Define Date Time Format Checkbox: Check to define an output format, uncheck to return default format (i.e. 2019-11-05T16:00:00Z)

Create Preview File

Creates a thumbnail from an image or video.

Staging Location: The location on the filesystem to place the thumbnail.

Date Based Folder Path

This task takes ones of the date fields on the Repository Document and uses it to generate the parent folder path for the document. This mimics how Alfresco stores its data in the filesystem

Repository Document Field: the field to use for the path - can be any date field

Pattern: The pattern used to break down the date into a folder path

The default is /yyyy/M/dd/kk/mm/ss/

In this case, kk is used for hours.

Example

For the document file.txt, if you set the field to used the modified date '#{rd.modifieddate}' and the date is 1999-04-20T12:01:23, then your path with the default pattern will be

/1999/4/20/12/01/23/file.txt

Duplication Check

This task checks the chosen field to find duplicate documents.

Field to Compare: The field whose value will be used to check for duplicates. If this value is found in any other document it will be considered a duplicate. Default is the file hash

If you wish to compare file hashes (a sort of fingerprint for a document), you will need to precede this task with a Hash Generator Task.

Duplication Check Scope: Check all documents, or just the ones associated with this job.

Action: What to do if a document is found. The Document can be skipped, failed, or simply logged.

EML Metadata Extraction

An EML file is an email message saved by an email application, such as Microsoft Outlook or Apple Mail. It contains the content of the message, along with the subject, sender, recipient(s), and date of the message. EML files may also store one or more email attachments, which are files sent with the message.

This task extracts metadata from emails and adds as fields to the repository document. These extracted fields appear as:

Email.Subject: The subject line of the email.
Email.From: The 'from' line of the email.
Email.To: The 'to' line of the email.
Email.Body: The body of the email.
Email.CC: The CC line of the email.
Email.BCC: The BCC line of the email.
Email.EmailList: Space separated list of To, From and CC addresses.
Email.InternetMessageId: The message ID of the email.
Email.Size: The size of the email.
Email.HasAttachments: Whether the email has attachments or not.
Email.Attachments: Comma delimited list of email attachments by name. Blank if none.
Email.AttachmentCount: The number of attachments in this email.
Email.DateTimeSent: The date time the email was sent.
Email.DateTimeCreated: The date time the email was created.
Email.LastModifiedTime: The modified date time of the email.
Email.DateTimeReceived: The date and time the email was received.

Extract Metadata From Path

Path Field: Which field to use to extract metadata. Options are

Repository ID (for repos that use the entire path as an ID, such as Filesystem, or Amazon S3)
Parent Path, for repos that don't use paths as ID.

Path Rules: Comma delimited list of rules to extract. The format is [New Field Name]=#{Location}

Location takes the form of an integer, starting at 0 for the root folder.
Additionally, 'filename','parent',and 'grandparent' with **quotes included can be used.

Example

For the file

/accounting/healthcare/userName.pdf

and the rules

department=#{0},type=#{'parent'}

department=accounting

type=healthcare

Field Lookup

This task is intended to allow users the ability to perform a look-up operation and update the matching fields of the repository documents. This task was originally created for integrations in which certain field values could only be obtained by importing them from an external source.

Task Name: Name of the Lookup Job Task

Delimiter: Denotes the delimiter character if values are not in list format.

Property Field: Path Uses preset configurations to evaluate values against the path

File Name Uses preset configurations to evaluate values against the file name
Other Property Evaluates a property entered by the user against the given values

Property Name: A specific property entered by the user. This property will be used only if Other Property is selected from the property field dropdown list.

Values: Key=Value listing to be used as the source data for the lookup operation.

Look for an exact match: If set to true, properties that have an exact (whole case) match will be updated. If set to false, properties that contain the entered value(s) will be updated. For instance, if set to true the value 123 will only have a match with 123. If false, a match will be identified if 123 is present in any part of the field and only that portion will be updated with the new value.

Simflofy can utilize any properties that are associated with a repository document. We recommend running a BFS output job with no mappings and Include Un-Mapped Properties set to True. This will generate a xml file similar to the example below and allow you to see what properties are available for your documents:

<properties>  
    <entry key="document.name">Simflofy Overview.doc</entry>  
    <entry key="type">document</entry>  
    <entry key="folderpath">testSimflofy_Setup</entry>  
    <entry key="document.Culture">en-US</entry>  
    <entry key="document.Customer">123456</entry>  
    <entry key="document.Category">training</entry>  
</properties>

In the example above you will notice that the document.Customer field has a numerical value associated with it, but not the actual customer name. The task will allow you to import a list of customer IDs along with their associated customer names and update those values as desired.

The updated values will then be available in subsequent job tasks. For instance, you could then use override folder path task to build out a folder structure with the updated values. Using the scenario above, you could build out a folder for each customer by name instead of numerical value.

File Name Extraction

Extracts the file name from another field using regex(Regular Expressions). It will set the file name to the value that matches the regex.

Regex to match: The regex to match and convert to and set as the file name.

Data Field: The repository document field to match the regex on.

Example - Alphanumerics

The pattern for alphanumeric characters is

[a-zA-Z0-9], or if you wish to include underscores \w

To select for non-alphanumeric characters we add the carat (^) before the pattern, so

^[a-zA-Z0-9]

The character simply translates to "Not", so it negates whatever is after it.

Example - Clearing Unwanted Spaces

The pattern \s is regex shorthand for "spaces". If you're worried about tabs, line breaks etc. add an asterisk (*) after the pattern for what is called "greedy" selection.

Adding this as your regex and setting the replacement as ''

Filename Cleanse

This task uses regex (Regular Expressions) to alter filenames.
Common use cases are clearing unwanted characters, such as whitespace, or non-alphanumeric characters.

Regex to Match: The regex pattern to search for in the filename.

The default value matches any character that isn't a letter, number, space or period an unlimited number of times.

Replacement: Replacement for matches.

Example - Alphanumerics

The pattern for alphanumeric characters is

[a-zA-Z0-9], or if you wish to include underscores \w

To select for non-alphanumeric characters we add the carat (^) before the pattern, so

^[a-zA-Z0-9]

The character simply translates to "Not", so it negates whatever is after it.

Example - Clearing Unwanted Spaces

The pattern \s is regex shorthand for "spaces". If you're worried about tabs, line breaks etc. add an asterisk (*) after the pattern for what is called "greedy" selection.

Adding this as your regex and setting the replacement as ''

FileNet ACL Modification

Deprecation

This task will be converted to a post-processor in 3.2

The purpose of this task is change the permission lists of integrated documents in the IBM FileNet Repository.

Authentication Connector ID: The ID of your P8 Authentication connection is available in the Authentications Connection under the Integration menu. You can find it the url of the edit or view page for the connection
Change List: Pipe (|) delimited list of principals to change from each document.
Permissions to Change: Access level for the changed permissions.
Add List: Pipe (|) delimited list of principals to add to each document
Permissions to Add: The access level to add to the new permissions.
Removal List: Pipe (|) delimited list of principals to remove from each document.

Note that any permission not added to these lists will be ignored.

Filesystem ACL Extraction

Extracts ACLs from the Windows or Linux filesystem document and adds them to the repository document.

File System Operating System: The operating system that these files are being read from.

Append _DENY to Deny ACL Type permissions?: Appends _DENY to the end of an extracted permission if it's of the type DENY, in case you want to track this later on.

This task will have some different behavior depending on your operating system. In a POSIX environment (macOS or Linux) permissions may be added as the field document.permissions with the permissions in a semicolon(;) delimited list, if any exist.

If the filesystem supplies an owner, it will be added as simflofy.owner

Additionally, simflofy will create a permission map of the principals and their permissions. It will set is as the originalPermissions field, so


Map<String, Set<String>> permissions = new Map<>();

//process acls

rd.setOriginalPermissions(permissions);

Finally, if any User Defined File Attributes (extended attributes), they will be added as a semicolon delimited list in the field

simflofy.userattributes

Filter Expression

The filter expression task allows you to remove files based on the expression used. The task has two fields, the task name and the filter expression. The task name is arbitrary but should identify the type of filter you're creating.

Filter Expression

The filter expression can use any of the Simflofy Expression Language components to form simple or more complex expressions for evaluation by Simflofy.

If the expression evaluates to true, the document will continue to be processed.

Example: If you want to exclude the OS level file Thumbs.db in your job, you can filter those files out using the following statement: !equalsIgnoreCase('#{rd.filename}', 'Thumbs.db')

The above example will find all the files that have the name 'Thumbs.db' and set the expression to false for the documents. Therefore, all files with the name 'Thumbs.db' will be skipped.

Folder Path Cleanse

This task uses Regular Expressions (regex) to alter Folder Names.Functionally identical to Filename Cleanse Task, except that is changes the document parent path.

Regex to Match: The regex pattern to search for in the Folder Name.

The default value matches any character that isn't a letter, number, space or period an unlimited number of times.

Replacement: Replacement for matches.

FTP

Sets the binary of a document to a binary found on an FTP server at the location found using the Path Expression.

Username: The username needed to log in to your FTP server
Password: The password needed to log in to your FTP server
Host URL: The URL of the FTP server
Encrypt Data: Sets whether to encrypt the data connection to the FTP server with TLS. Note that Use FTPS must be checked if you use this.
Use FTPS: Sets whether to use FTP with SSL protocol.
Is Implicit: Sets whether to use FTP implicit SSL or FTP explicit SSL
EPSV With IPV4: Sets whether to use EPSV with IPV4.
Server Port: The port your FTP server is listening on
Thread Count: Number of FTP poster threads

Generic ACL Mapper

The generic ACL mapper job task allows you to create simple rules for matching principles and permissions from one system to another. ACLs will need to be extracted from each document. This task reads the originalPermissions field of the document and sets the transformedPermissions field.

Limited Usage

Only the Azure Blob, CMIS, and Alfresco Connectors can use this task. For all other acl mapping, a Javascript task or ACL Converter Task is required

Process Files and Process Folders tells the task what to process.
Mapping Rules: These rules will map the role/permissions on the left with the ones you want to match on the right.
- Permissions on the right will be from the source and those on the left will be for the target.
- Both side of a rule can be a comma delimited list.
- Each rule must end with a semicolon (;)
Authority Cleanup: A comma delimited list of principals to remove as part of the task.
Enabling parsing: Enable parsing for multiple permissions to be assigned to an authority after mapping?
- For example, [Write]=WRITE_DATA with this box checked will allow 'Write' to be added on as an extra permission to an authority's newly mapped permissions.

Google Vision Image Labels

Detects real world objects in images and adds these labels to the repository document on the field simflofy_ai_labels using Google Vision.

Max Number of labels: The maximum number of recognized labels to be returned

Authentication connection: Your authentication connection for Google. You can find it in the url while edit or view page for the connection

Google Vision Text Extraction

Extracts text from .tiff, .pdf and .gif files and stores it on the repository document in the simflofy_ai_texts field.

Authentication Connector: Your authentication connection for Google You can find it in the url while edit or view page for the connection

Hash Value Generator

Creates a hash of the document content and sets it on the repository document. Can be used in tandem with the Duplication Check Task to find duplicates.

Staging: Where to stage the data locally, while creating the MD5 Hash.

Hash Type: The type of hash to use. MD5 is the default hash, but SHA hashes are available

HTML to PDF

This task takes a single argument, which is a file path to an XHTML stylesheet ( *.xsl).

The binary stream is taken from each repository document and converts it to a pdf using the template.

Here's a default stylesheet to save and try.

Less complex examples can be found here

Sample XSL Style Sheet

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xhtml="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="xhtml">
<xsl:output method="xml" version="1.0" standalone="yes" omit-xml-declaration="yes" 
  encoding="utf-8" media-type="text/xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:preserve-space elements="xhtml:listing listing xhtml:plaintext plaintext xhtml:pre pre xhtml:samp samp"/>
<xsl:param name="font-size" select="''"/>
<xsl:param name="font.symbol" select="'Arial Unicode MS'"/>

<xsl:template name="common-atts">
  <xsl:copy-of select="@id|@color|@height|@width|@xml:lang"/>
  <xsl:if test="@align"><xsl:attribute name="text-align"><xsl:value-of select="@align"/></xsl:attribute></xsl:if>
  <xsl:if test="@nowrap"><xsl:attribute name="wrap-option">no-wrap</xsl:attribute></xsl:if>
</xsl:template>

<xsl:template match="xhtml:html|html">
  <fo:root>
    <fo:layout-master-set>
      <fo:simple-page-master master-name="page">
        <fo:region-body margin=".75in .75in .75in .75in"/>
        <fo:region-before extent=".5in"/>
        <fo:region-after extent=".5in"/>
      </fo:simple-page-master>
    </fo:layout-master-set>
    <fo:page-sequence master-reference="page">
      <fo:static-content flow-name="xsl-region-before">
        <fo:block display-align="after" padding-before=".2in" text-align="center" font-size="9pt">
          <xsl:apply-templates select="xhtml:head/xhtml:title|head/title"/>
        </fo:block>
      </fo:static-content>
      <fo:static-content flow-name="xsl-region-after">
        <fo:block display-align="before" text-align="center" font-size="8pt">
          <xsl:text>page </xsl:text>
          <fo:page-number/><xsl:text> of </xsl:text>
          <fo:page-number-citation ref-id="__END__"/>
        </fo:block>
      </fo:static-content>
      <xsl:apply-templates/>
    </fo:page-sequence>
  </fo:root>
</xsl:template>

<xsl:template match="xhtml:title|title">
  <fo:block><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:basefont|basefont">
  <xsl:copy-of select="@color"/>
  <xsl:choose>
    <xsl:when test="@size=1"><xsl:attribute name="font-size">xx-small</xsl:attribute></xsl:when>
    <xsl:when test="@size=2"><xsl:attribute name="font-size">x-small</xsl:attribute></xsl:when>
    <xsl:when test="@size=3"><xsl:attribute name="font-size">small</xsl:attribute></xsl:when>
    <xsl:when test="@size=4"><xsl:attribute name="font-size">medium</xsl:attribute></xsl:when>
    <xsl:when test="@size=5"><xsl:attribute name="font-size">large</xsl:attribute></xsl:when>
    <xsl:when test="@size=6"><xsl:attribute name="font-size">x-large</xsl:attribute></xsl:when>
    <xsl:when test="@size=7"><xsl:attribute name="font-size">xx-large</xsl:attribute></xsl:when>
  </xsl:choose>
  <xsl:if test="@face"><xsl:attribute name="font-family"><xsl:value-of select="@face"/></xsl:attribute></xsl:if>
</xsl:template>

<xsl:template match="xhtml:body|body">
  <fo:flow flow-name="xsl-region-body">
    <xsl:call-template name="common-atts"/>
    <xsl:apply-templates select="//basefont[1]"/>
    <xsl:if test="$font-size"><xsl:attribute name="font-size"><xsl:value-of select="$font-size"/></xsl:attribute></xsl:if>
    <xsl:apply-templates/>
    <fo:block id="__END__"/>
  </fo:flow>
</xsl:template>

<xsl:template match="xhtml:head|head|xhtml:applet|applet|xhtml:area|area|xhtml:base|base
  |xhtml:bgsound|bgsound|xhtml:embed|embed|xhtml:frame|frame|xhtml:frameset|frameset|xhtml:iframe|iframe
  |xhtml:ilayer|ilayer|xhtml:layer|layer|xhtml:input[@type='hidden']|input[@type='hidden']
  |xhtml:isindex|isindex|xhtml:link|link|xhtml:map|map|xhtml:meta|meta|xhtml:object|object|xhtml:param|param
  |xhtml:ruby|ruby|xhtml:rt|rt|xhtml:script|script|xhtml:spacer|spacer|xhtml:style|style|xhtml:wbr|wbr
  |xhtml:xml|xml|xhtml:xmp|xmp"/>
<xsl:template match="comment">
  <xsl:comment><xsl:apply-templates/></xsl:comment>
</xsl:template>
<xsl:template match="processing-instruction()">
  <xsl:copy-of select="."/>
</xsl:template>

<!-- Links and Media -->

<xsl:template match="xhtml:a|a">
  <fo:inline><xsl:call-template name="common-atts"/>
    <xsl:if test="@name and not(@id)">
      <xsl:attribute name="id"><xsl:value-of select="@name"/></xsl:attribute>
    </xsl:if>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:a[@href]|a[@href]">
  <fo:basic-link color="blue" text-decoration="underline">
    <xsl:if test="@type"><xsl:attribute name="content-type"><xsl:value-of select="@type"/></xsl:attribute></xsl:if>
    <xsl:choose>
      <xsl:when test="starts-with(@href,'#')">
        <xsl:attribute name="internal-destination"><xsl:value-of select="substring-after(@href,'#')"/></xsl:attribute>
      </xsl:when>
      <xsl:otherwise>
        <xsl:attribute name="external-destination">
          <xsl:text>url(&apos;</xsl:text>
          <xsl:value-of select="concat(//base/@href,@href)"/>
          <xsl:text>&apos;)</xsl:text>
        </xsl:attribute>
      </xsl:otherwise>
    </xsl:choose>
    <xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:basic-link>
</xsl:template>

<xsl:template match="xhtml:img|img|xhtml:input[@type='image']|input[@type='image']">
  <fo:external-graphic content-type="{@type}" src="{concat(//base/@href,@src)}">
    <xsl:call-template name="common-atts"/>
  </fo:external-graphic>
</xsl:template>

<xsl:template match="xhtml:object[starts-with(@type,'image/')]|object[starts-with(@type,'image/')]">
  <fo:external-graphic content-type="{@type}" src="{concat(//base/@href,@data)}">
    <xsl:call-template name="common-atts"/>
  </fo:external-graphic>
</xsl:template>

<!-- Tables -->

<xsl:template match="xhtml:table">
  <xsl:apply-templates select="caption"/>
  <fo:table width="100%"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates select="colgroup|col"/>
    <xsl:variable name="tr1" select="(xhtml:tr|xhtml:thead/xhtml:tr|xhtml:tbody/xhtml:tr|xhtml:tfoot/xhtml:tr)[1]"/>
    <xsl:variable name="cols" select="xhtml:colgroup/xhtml:col|xhtml:col"/>
    <xsl:call-template name="mock-col">
      <xsl:with-param name="cols" select="(count($tr1/xhtml:*[not(@colspan)])+sum($tr1/xhtml:*/@colspan))
        -(count($cols[not(@colspan)])+sum($cols/@colspan))"/>
    </xsl:call-template>
    <xsl:apply-templates select="xhtml:thead|xhtml:tfoot|xhtml:tbody"/>
    <xsl:if test="xhtml:tr">
      <fo:table-body><xsl:call-template name="common-atts"/>
        <xsl:apply-templates select="xhtml:tr"/>
      </fo:table-body>
    </xsl:if>
  </fo:table>
</xsl:template>

<xsl:template match="table">
  <xsl:apply-templates select="caption"/>
  <fo:table width="100%"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates select="colgroup|col"/>
    <xsl:variable name="tr1" select="(tr|thead/tr|tbody/tr|tfoot/tr)[1]"/>
    <xsl:variable name="cols" select="colgroup/col|col"/>
    <xsl:call-template name="mock-col">
      <xsl:with-param name="cols" select="(count($tr1/*[not(@colspan)])+sum($tr1/*/@colspan))
        -(count($cols[not(@colspan)])+sum($cols/@colspan))"/>
    </xsl:call-template>
    <xsl:apply-templates select="thead|tfoot|tbody"/>
    <xsl:if test="tr">
      <fo:table-body><xsl:call-template name="common-atts"/>
        <xsl:apply-templates select="tr"/>
      </fo:table-body>
    </xsl:if>
  </fo:table>
</xsl:template>

<xsl:template match="xhtml:colgroup|colgroup">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template name="mock-col">
  <xsl:param name="cols" select="1"/>
  <xsl:if test="$cols&gt;0">
    <fo:table-column column-width="proportional-column-width(1)"/>
    <xsl:call-template name="mock-col">
      <xsl:with-param name="cols" select="$cols -1"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>

<xsl:template match="xhtml:col|col">
  <fo:table-column><xsl:call-template name="common-atts"/>
    <xsl:if test="@span">
      <xsl:attribute name="number-columns-spanned"><xsl:value-of select="@span"/></xsl:attribute>
    </xsl:if>
    <xsl:choose>
      <xsl:when test="@width">
        <xsl:attribute name="column-width"><xsl:value-of select="@width"/></xsl:attribute>
      </xsl:when>
      <xsl:otherwise>
        <xsl:attribute name="column-width">proportional-column-width(1)</xsl:attribute>
      </xsl:otherwise>
    </xsl:choose>
  </fo:table-column>
</xsl:template>

<xsl:template match="xhtml:tbody|tbody">
  <fo:table-body><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:table-body>
</xsl:template>

<xsl:template match="xhtml:thead|thead">
  <fo:table-header><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:table-header>
</xsl:template>

<xsl:template match="xhtml:tfoot|tfoot">
  <fo:table-footer><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:table-footer>
</xsl:template>

<xsl:template match="xhtml:tr|tr">
  <fo:table-row><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:table-row>
</xsl:template>

<xsl:template match="xhtml:th|th">
  <fo:table-cell font-weight="bold" padding=".1em"><xsl:call-template name="common-atts"/>
    <xsl:if test="@colspan">
      <xsl:attribute name="number-columns-spanned"><xsl:value-of select="@colspan"/></xsl:attribute>
    </xsl:if>
    <xsl:if test="@rowspan">
      <xsl:attribute name="number-rows-spanned"><xsl:value-of select="@rowspan"/></xsl:attribute>
    </xsl:if>
    <fo:block>
      <xsl:if test="parent::xhtml:tr/parent::xhtml:thead|parent::tr/parent::thead">
        <xsl:attribute name="text-align">center</xsl:attribute>
      </xsl:if>
      <xsl:apply-templates/>
    </fo:block>
  </fo:table-cell>
</xsl:template>

<xsl:template match="xhtml:td|td">
  <fo:table-cell padding=".1em"><xsl:call-template name="common-atts"/>
    <xsl:if test="@colspan">
      <xsl:attribute name="number-columns-spanned"><xsl:value-of select="@colspan"/></xsl:attribute>
    </xsl:if>
    <xsl:if test="@rowspan">
      <xsl:attribute name="number-rows-spanned"><xsl:value-of select="@rowspan"/></xsl:attribute>
    </xsl:if>
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </fo:table-cell>
</xsl:template>

<!-- Lists -->

<xsl:template match="xhtml:dd|dd">
  <fo:list-item><xsl:call-template name="common-atts"/>
    <fo:list-item-label><fo:block/></fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>
        <xsl:apply-templates/>
      </fo:block>
    </fo:list-item-body>
  </fo:list-item>
</xsl:template>

<xsl:template match="xhtml:dl|dl">
  <fo:list-block provisional-label-separation=".2em" provisional-distance-between-starts="3em">
    <xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:list-block>
</xsl:template>

<xsl:template match="xhtml:dt|dt">
  <fo:list-item><xsl:call-template name="common-atts"/>
    <fo:list-item-label><fo:block/></fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>
        <xsl:apply-templates/>
      </fo:block>
    </fo:list-item-body>
  </fo:list-item>
</xsl:template>

<xsl:template match="xhtml:ol|ol">
  <fo:list-block provisional-label-separation=".2em"
    provisional-distance-between-starts="{string-length(count(li))*.9+.6}em">
    <xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:list-block>
</xsl:template>

<xsl:template match="xhtml:ol/xhtml:li|ol/li">
  <fo:list-item><xsl:call-template name="common-atts"/>
    <fo:list-item-label end-indent="label-end()">
      <fo:block text-align="end">
        <xsl:variable name="value">
          <xsl:choose>
            <xsl:when test="@value"><xsl:value-of select="@value"/></xsl:when>
            <xsl:otherwise><xsl:number/></xsl:otherwise>
          </xsl:choose>
        </xsl:variable>
        <xsl:choose>
          <xsl:when test="@type='I'"><xsl:number format="I" value="$value"/></xsl:when>
          <xsl:when test="@type='A'"><xsl:number format="A" value="$value"/></xsl:when>
          <xsl:when test="@type='i'"><xsl:number format="i" value="$value"/></xsl:when>
          <xsl:when test="@type='a'"><xsl:number format="a" value="$value"/></xsl:when>
          <xsl:when test="parent::xhtml:ol/@type='I' or parent::ol/@type='I'"><xsl:number format="I" value="$value"/></xsl:when>
          <xsl:when test="parent::xhtml:ol/@type='A' or parent::ol/@type='I'"><xsl:number format="A" value="$value"/></xsl:when>
          <xsl:when test="parent::xhtml:ol/@type='i' or parent::ol/@type='I'"><xsl:number format="i" value="$value"/></xsl:when>
          <xsl:when test="parent::xhtml:ol/@type='a' or parent::ol/@type='I'"><xsl:number format="a" value="$value"/></xsl:when>
          <xsl:otherwise><xsl:number format="1" value="$value"/></xsl:otherwise>
        </xsl:choose>
        <xsl:text>.</xsl:text>
      </fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>
        <xsl:apply-templates/>
      </fo:block>
    </fo:list-item-body>
  </fo:list-item>
</xsl:template>

<xsl:template match="xhtml:ul|ul|xhtml:menu|menu">
  <fo:list-block provisional-label-separation=".2em" provisional-distance-between-starts="1.6em">
    <xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:list-block>
</xsl:template>

<xsl:template match="xhtml:ul/xhtml:li|ul/li|xhtml:menu/xhtml:li|menu/li">
  <fo:list-item><xsl:call-template name="common-atts"/>
    <fo:list-item-label end-indent="label-end()">
      <fo:block text-align="end">
        <fo:inline font-family="{$font.symbol}">
          <xsl:choose>
            <xsl:when test="@type='square'"><xsl:text disable-output-escaping="yes">&amp;#x25AA;</xsl:text></xsl:when>
            <xsl:when test="@type='circle'"><xsl:text disable-output-escaping="yes">&amp;#x25CB;</xsl:text></xsl:when>
            <xsl:when test="parent::xhtml:ul/@type='square' or parent::ul/@type='square'">
              <xsl:text disable-output-escaping="yes">&amp;#x25AA;</xsl:text>
            </xsl:when>
            <xsl:when test="parent::xhtml:ul/@type='circle' or parent::ul/@type='square'">
              <xsl:text disable-output-escaping="yes">&amp;#x25CB;</xsl:text>
            </xsl:when>
            <xsl:otherwise><xsl:text disable-output-escaping="yes">&amp;#x2022;</xsl:text></xsl:otherwise>
          </xsl:choose>
        </fo:inline>
      </fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>
        <xsl:apply-templates/>
      </fo:block>
    </fo:list-item-body>
  </fo:list-item>
</xsl:template>

<!-- Blocks -->

<xsl:template match="xhtml:address|address">
  <fo:block font-style="italic"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:blockquote|blockquote">
  <fo:block space-before="1em" space-after="1em" start-indent="3em" end-indent="3em">
    <xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:br|br">
  <fo:block white-space="pre"><xsl:call-template name="common-atts"/>
    <xsl:text disable-output-escaping="yes">&amp;#10;</xsl:text>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:caption|caption">
  <fo:block keep-with-next="always" text-align="center">
    <xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:center|center">
  <fo:block text-align="center">
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:div|div|xhtml:multicol|multicol|xhtml:noembed|noembed|xhtml:noframes|noframes
  |xhtml:nolayer|nolayer|xhtml:noscript|noscript">
  <fo:block><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:h1|h1">
  <fo:block font-size="180%" font-weight="bold"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:h2|h2">
  <fo:block font-size="160%" font-weight="bold"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:h3|h3">
  <fo:block font-size="140%" font-weight="bold"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:h4|h4">
  <fo:block font-size="120%" font-weight="bold"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:h5|h5">
  <fo:block font-size="110%" font-weight="bold"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:h6|h6|xhtml:legend|legend">
  <fo:block font-weight="bold"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:hr|hr">
  <fo:leader leader-pattern="rule" rule-style="groove">
    <xsl:if test="@size">
      <xsl:attribute name="rule-thickness"><xsl:value-of select="@size"/><xsl:text>pt</xsl:text></xsl:attribute>
    </xsl:if>
  </fo:leader>
</xsl:template>

<xsl:template match="xhtml:listing|listing|xhtml:plaintext|plaintext|xhtml:pre|pre|xhtml:samp|samp">
  <fo:block white-space="pre"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:p|p">
  <fo:block space-before=".6em" space-after=".6em"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<!-- Inlines -->

<xsl:template match="xhtml:abbr|abbr|xhtml:acronym|acronym">
  <fo:inline><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
  <xsl:text> (</xsl:text>
  <xsl:value-of select="@title"/>
  <xsl:text>)</xsl:text>
</xsl:template>

<xsl:template match="xhtml:b|b|xhtml:strong|strong">
  <fo:inline font-weight="bold"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:bdo|bdo">
  <fo:bidi-override direction="{@dir}"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:bidi-override>
</xsl:template>

<xsl:template match="xhtml:big|big">
  <fo:inline font-size="larger"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:blink|blink|xhtml:marquee|marquee">
  <fo:inline background-color="yellow"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:cite|cite|xhtml:dfn|dfn|xhtml:em|em|xhtml:i|i|xhtml:var|var">
  <fo:inline font-style="italic"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:code|code|xhtml:kbd|kbd|xhtml:tt|tt">
  <fo:inline font-family="monospace"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:del|del|xhtml:s|s|xhtml:strike|strike">
  <fo:inline text-decoration="line-through"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:font|font">
  <fo:inline><xsl:call-template name="common-atts"/>
    <xsl:choose>
      <xsl:when test="@size=1"><xsl:attribute name="font-size">xx-small</xsl:attribute></xsl:when>
      <xsl:when test="@size=2"><xsl:attribute name="font-size">x-small</xsl:attribute></xsl:when>
      <xsl:when test="@size=3"><xsl:attribute name="font-size">small</xsl:attribute></xsl:when>
      <xsl:when test="@size=4"><xsl:attribute name="font-size">medium</xsl:attribute></xsl:when>
      <xsl:when test="@size=5"><xsl:attribute name="font-size">large</xsl:attribute></xsl:when>
      <xsl:when test="@size=6"><xsl:attribute name="font-size">x-large</xsl:attribute></xsl:when>
      <xsl:when test="@size=7"><xsl:attribute name="font-size">xx-large</xsl:attribute></xsl:when>
    </xsl:choose>
    <xsl:if test="@face"><xsl:attribute name="font-family"><xsl:value-of select="@face"/></xsl:attribute></xsl:if>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:ins|ins|xhtml:u|u">
  <fo:inline text-decoration="underline"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:nowrap|nowrap">
  <fo:inline wrap-option="no-wrap"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:q|q">
  <fo:inline><xsl:call-template name="common-atts"/>
    <xsl:text disable-output-escaping="yes">&amp;#x201C;</xsl:text>
    <xsl:apply-templates/>
    <xsl:text disable-output-escaping="yes">&amp;#x201D;</xsl:text>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:q|q[starts-with(.,'&#x22;') or starts-with(.,'&#x201C;') or starts-with(.,'&#x201F;')]">
  <fo:inline><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:small|small">
  <fo:inline font-size="smaller"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:span|span">
  <fo:inline><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:sub|sub">
  <fo:inline baseline-shift="sub"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:sup|sup">
  <fo:inline baseline-shift="super"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<!-- Forms -->

<xsl:template match="xhtml:button|button">
  <fo:block background-color="silver" border="3pt outset silver" text-align="center" width="auto">
    <xsl:call-template name="common-atts"/>
    <xsl:text> </xsl:text>
    <xsl:apply-templates/>
    <xsl:text> </xsl:text>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:fieldset|fieldset">
  <fo:block border="1pt groove gray"><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:form|form">
  <fo:block><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:input|input"><!-- default input is text (also handles password & file) -->
  <fo:leader leader-pattern="rule"><xsl:call-template name="common-atts"/>
    <xsl:attribute name="leader-length">
      <xsl:choose>
        <xsl:when test="@size"><xsl:value-of select="@size"/><xsl:text>em</xsl:text></xsl:when>
        <xsl:otherwise>10em</xsl:otherwise>
      </xsl:choose>
    </xsl:attribute>
  </fo:leader>
</xsl:template>

<xsl:template match="xhtml:input[@type='checkbox']|input[@type='checkbox']">
  <fo:inline font-family="{$font.symbol}" font-size="larger"><xsl:call-template name="common-atts"/>
    <xsl:choose>
      <xsl:when test="@checked"><xsl:text disable-output-escaping="yes">&amp;#x2611;</xsl:text></xsl:when>
      <xsl:otherwise><xsl:text disable-output-escaping="yes">&amp;#x2610;</xsl:text></xsl:otherwise>
    </xsl:choose>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:input[@type='radio']|input[@type='radio']">
  <fo:inline font-family="{$font.symbol}" font-size="larger"><xsl:call-template name="common-atts"/>
    <xsl:choose>
      <xsl:when test="@checked"><xsl:text disable-output-escaping="yes">&amp;#x25C9;</xsl:text></xsl:when>
      <xsl:otherwise><xsl:text disable-output-escaping="yes">&amp;#x25CB;</xsl:text></xsl:otherwise>
    </xsl:choose>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:input[@type='button' or @type='submit' or @type='reset']
  |input[@type='button' or @type='submit' or @type='reset']">
  <fo:block background-color="silver" border="3pt outset silver" text-align="center" width="auto">
    <xsl:call-template name="common-atts"/>
    <xsl:text> </xsl:text>
    <xsl:choose>
      <xsl:when test="@value"><xsl:value-of select="@value"/></xsl:when>
      <xsl:otherwise><xsl:value-of select="@type"/></xsl:otherwise>
    </xsl:choose>
    <xsl:text> </xsl:text>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:label|label">
  <fo:inline><xsl:call-template name="common-atts"/>
    <xsl:apply-templates/>
  </fo:inline>
</xsl:template>

<xsl:template match="xhtml:select[@size=1]|select[@size=1]">
  <fo:leader leader-pattern="rule" leader-length="10em">
    <xsl:call-template name="common-atts"/>
  </fo:leader>
</xsl:template>

<xsl:template match="xhtml:select|select">
  <fo:block><xsl:call-template name="common-atts"/>
    <xsl:if test="@size">
      <xsl:attribute name="height"><xsl:value-of select="@size"/><xsl:text>em</xsl:text></xsl:attribute>
    </xsl:if>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:optgroup|optgroup">
  <fo:block font-style="italic" font-weight="bold"><xsl:call-template name="common-atts"/>
    <xsl:value-of select="@label"/>
  </fo:block>
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="xhtml:option|option">
  <fo:block><xsl:call-template name="common-atts"/>
    <xsl:if test="parent::xhtml:optgroup|parent::optgroup">
      <xsl:attribute name="start-indent">1em</xsl:attribute>
    </xsl:if>
    <xsl:choose>
      <xsl:when test="@label"><xsl:value-of select="@label"/></xsl:when>
      <xsl:otherwise><xsl:apply-templates/></xsl:otherwise>
    </xsl:choose>
  </fo:block>
</xsl:template>

<xsl:template match="xhtml:textarea|textarea">
  <fo:block border="2pt inset silver" height="{@rows}em" width="{@cols}em">
    <xsl:choose>
      <xsl:when test="node()"><xsl:apply-templates/></xsl:when>
      <xsl:otherwise><xsl:text> </xsl:text></xsl:otherwise>
    </xsl:choose>
  </fo:block>
</xsl:template>

</xsl:stylesheet>

HTTP

The purpose of this task is to execute a GET or POST HTTP call for each repository document. If the status is 200 (OK), then processing will continue,

if not, the document will be skipped.

Username/Password Credentials to access the endpoint
Action URL: The endpoint being contacted, without parameters
POST or GET: Selects the HTTP Method
Request parameters: In the form of param1=value1¶m2=value2
- The 'values' can be dynamically created using the Simflofy Expression Language
Action to take if the call does not return 200(OK): Continue, Skip or Fail are the options
Timeout in milliseconds: How long to wait for the call to execute before attempting to continue.
- If the timeout is reached, the action selected will be used.

IBM Watson Natural Language Classification

Uses IBM Watson Natural Language classifier to analyze and label text into categories. This uses Watson's Natural Language Classifier apis

The IAM API Key for IBM Cloud. See instructions here on how to get it.
The Classifier ID The ID of you classifier, created in Watson.
Field to process If left blank, the task will search for the field 'content'
Field to store the Watson response. If left blank, the field name will be 'nlc'.
Comma delimited list of file extensions Which content types to process or leave blank to process all.

Create IBM Watson Natural Language Classification Task

Index User Groups

Used for Search Security to index user and group information onto each document.

Select User Groups To Index: The selected user groups will be added to the field simflofyUserGroups as well as adding a mapping. Select Users to Index: The selected users will be to the field simflofyUsers as well as adding a mapping.

JavaScript Processing

Simflofy's JavaScript Task allows you to run JavaScript against a repository document during processing. The task utilizes Java's Nashorn engine, which uses the ECMA 5.1 specification to execute the JavaScript against the native Java API. Currently, only one root object is exposed called rd (repository document). Any method that can be called against the repository document can be called using JavaScript syntax. Fields that are also available:

Variable Name : Variable

The Job ID : jobId

The Job Run ID : jobRunId

Logger : log

info

In order to use javascript logging, you will need to set the log appendercom.simflofy.tasks to Info Debug or Trace. This can be done in Admin : Logging : Log Levels
Output from the logger methods (i.e. log.info(), log.debug(), etc.) can be found in the simflofy-admin.log file.

The variable result is automatically declared in each run of the javascript task, and its return value can be used to control which documents continue processing.

The task only supports single quotes for strings. Double quotes will cause errors due to how the content of the script is saved to the database.

Below is an example of failing and skipping documents based on their content type.

Return Value : Status

'c' : Pass

'f' : Fail

's' : Skip

Here are some of the methods the **rd** objects supports:

addSingleField(String, Object)
addArrayField(String, Object[])
addSingleDateField(String, Date)
addDateArrayField(String, Date[])
getCreatedDate()
getFileLength()
getFileName()
getLastAccessedDate()
getMd5()
getModifiedDate()
getPath()
setFileName(String)
setPath(String)
setModifiedDate(Date)
getDocumentField(String) - Produces a DocumentField object with the following methods. Returns null if the field is not present.
- getFieldName()
- getFieldValue()
  - Ex. rd.getDocumentField('myfield').getFieldValue();
- isFieldValueArray() - Check if the value of the field is an array
- getFieldValueAsArray() - Return the field value as an array of Objects (even single values)
- getFieldType() - Return the data type of the field
- toJSONObject() - returns a JSON Object containing the keys "name","type","value"
getDocumentFieldWithLowerCase(String) - Case insensitive version of getDocumentField()
getFieldAsStringHolder(String) - Returns a FieldStringValueHolder object with some utility methods for checking field content
- isEmpty() - Checks if the field is empty
- isArray() - Check if the field has multiple values
- getSingleStringValue() - Return the values as a comma delimited list
- getArrayValue() - Returns the value as a String array
getVersionInfo() - Produces a VersionInfoBean object. Null if none is present.If retrieving versions from the source repository, the task will process them in order. Setters are available for all the following as well.
- getTotalVersions() returns and [integer]
- getLabel()
- getFileName()
- getLastModified() returns a [long] in timestamp format
- getFileLength() returns a [long] in bytes
- getPath()
- isLatest() returns a [boolean]
- isMajor() returns a [boolean]
- getObjectId
getFieldsMap() - Returns a map of field names and document field objects. Below is an example script to remove fields with empty values:

var map = rd.getFieldsMap();
for(i in map){
    var val = map[i].getFieldValue()[0];
    
    if(typeof val === 'undefined' || val === ''){
        log.info('Removing ' + i);
        rd.removeField(i);
    }
}

Many more methods are available. For a full list see source (RepositoryDocument.java) (Source access required, notify support for more information.)

Jython Script Processing

The purpose of this task is to allow the user to process repository documents using the python scripting language. It does so through the Jython script executor.

The task automatically imports the sys library and sets the document to the variable rd.

The methods available are similar to those out lined in the Javascript Task.

Note that any script placed here will be run on each document individually, which may affect performance.

It is suggested that you test any script you place here before attempting an integration to ensure it is syntactically correct.

Below is an example of removing fields with empty values from the document:

map = rd.getFieldsMap()
for key, value in map.iteritems():
    if value.getFieldValue() == '':
        print 'Removing ' + key
        rd.removeField(key)

Metadata Replace Job

This task will perform a Regex search on the configured metadata field and replaces it with supplied text.

Source Metadata Field: The field to check.

Regex to Match: A regular expression to use as the first argument of a Java String.replaceAll() method.

Replacement Text: The text to use as the second argument of the replaceAll() method.

Mongo Document Metadata Join

The purposes of this task is to retrieve metadata from an outside mongo database during a migration, adding it to the repository document before mappings are performed.

This task is called a "join" because it checks for a value on both the Repository Document and in the Mongo Document.

The checked field on the Repository Document can be either a standard field mapping, or it can be calculated using the Simflofy Expression Language.
If either checked field is missing, the process will be skipped.
If the values of the checked field matches, the comma delimited list of fields will be added from the Mongo Document, in lowercase.
They can then be added as normal field mappings in a job mapping, as Field mappings take place after Tasks.

Example: ContractStatus in a Mongo Document will be mapped as constractstatus in a field mapping.

Create Mongo Document Metadata Join Task

Output Metadata As JSON

This task fully converts each document to a json object and exports to it to the configured path on the local filesystem where Simflofy is running.

Output Path: Folder where the json files will be written. Path will be created if it does not exist.

File name pattern = [filename].document.metadata.json

Folder name pattern = [foldername].folder.metadata.json

Override File Name

The task uses the Simflofy Expression Language to override the file name of each document.

Functionally identical to the Override Folder Path Task.

Deep Change: If the file has versions, update their names as well.

Override Folder Path

Add the Override Folder Path task from the drop-down on your job. Not sure how? Check here.

A sample pattern has been provided for you by default. You can also leverage the Simflofy expression language when modifying your path. More information on the Simflofy Expression Language can be found here. Click the Done button when you have finished modifying your job task.

The example pattern is

'/' + '#{rd.filename}' + '/simflofy'

Using the Simflofy Expression Language you can see that rd. are internal Simflofy Fields.

However, you may want to use metadata from the source system to generate your path.

Use field in place of rd to accomplish this or leave off the prefix entirely.

The best way to put your path together is to know what fields are available and what the field values look like. We suggest running the BFS output with no mappings and Include Un-Mapped Properties set to True. This will generate a xml file such as:

<properties>
   <entry key="document.name">Alfresco Ingestion.pptx</entry>
   <entry key="type">document</entry>
   <entry key="folderpath">testSimflofy_Partners</entry>
   <entry key="separator">,</entry>
   <entry key="document.Culture">en-US</entry>
   <entry key="document.CustomerId">123</entry>
   <entry key="document.Category">legal</entry>
   <entry key="document.lastindex">23</entry>
</properties>

Now let's say we want the actual folder path to be a combination of folderpath + Culture + Category + Customer ID. To do that we just reference each field like:

'/' + '#{rd.path}' + '/' + '#{document.Culture}'+'/' +
'#{document.Category}'+'/' + '#{document.CustomerId}'

Pause

This task can be useful when outputting to a repository with rate limits, such as Box or SharePoint.

Pause in seconds: How long to pause for each document.

PII Detection

The PII Detection Job Task is uses regex expressions to detect PII in any document or metadata passing through Simflofy. The regex expressions are stored in the form of a .properties file.

PII Flag

This task will always add the boolean field hasPii for the purposes of mapping and analysis.

Default File Location

The default file is located at simflofy-admin/WEB-INF/classes/simflofy-pii- detection.properties

PII Rules File: The rules files to use.

Field To Mark: The output metadata property to store PII detected. The value of this field will be a map

{
  "PhoneNumber": 20,
  "Names": 200
}

Fields To Check: Source properties and/or document to check for PII. Use ALL_PROPS to check all properties, BINARY to check the document (extracted via Tika) or individual property names.

Break up pii data into individuals fields: Instead of adding the PII as a map, Simflofy will break it up as individual fields for easier mapping/processing.

Prefix for pii fields: If breaking up PII data, the prefix to use for each field. If left blank 'pii' will be used.

In this case, the above fields will come across as

pii.phonenumber and pii.names

Property XML Parser Job

Field to Parse: If the content of the file is the xml to parse, use BINARY. If the xml appears in a field, use the field name.

The xml is expected in the following form:

<rootElement>
<comment>Comment Text Here</comment>
<entry key="fieldName1">fieldValue1</entry>
<entry key="fieldName2">fieldValue2</entry>
</rootElement>

The root element is named as such for the example. The task searches for "entry" children.
Comment is an optional field and will be added as comment to the document properties.

Random Date Field Populator

Adds a field with a random ISO Date Time formatted date to a repository document.

Field Name: The name of the field that will receive a random date

Start Date: The minimum time for the random date to fall between. Enter the time as an epoch timestamp.

End Date: The maximum time for the random date to fall between. Enter the time as an epoch timestamp.

Random Field Populator

Field Name: The name of the field to populate.

Field Values: Comma delimited list of values to choose from when populating the field at random.

Remove Mappings

Removes mappings from a repository document if the field has no values.

Source Fields: Comma delimited list of the mappings to remove if it has no values.

Remove Renditions Matching Binary MimeType

This task takes no arguments. It will remove all renditions that have the same MimeType as the Original Document Binary.

Rename File On Duplicate Filepath

Functions similarly to the Duplication Check Task, except if it finds a duplicate, it will rename the file using the supplied pattern.

S3 Attach Binary

Attaches a binary to the repository document from an S3 bucket.

S3 Access Key: Your access key for your S3 connection
Client Secret: Your client secret for your S3 Connection
Base Folder Path: A base path to be prepended to your key
S3 Bucket: The bucket to look for the binary in
Key Field: The field on the repo document that contains the key for S3 bucket lookup.
S3 Region: The region of your S3 bucket

Separate Binaries

This task is designed to grab a binary from the filesystem given a path set in the form field. The task will look for a file of the same name.

Expressions should be used to set the path dynamically. ie /path/to/ + #{rd.FileName}
If there is no file at the specified path, the documents will be skipped.
If there is an error reading the file, an error will be noted in the tasks/job run.

Skip Blank File Name

Skips a file during migration if its file name is blank.

No additional Task Configurations needed.

Skip On Empty Field

This task will skip a document if the supplied fields are all blank.

String to Boolean

The String to Boolean Task is designed to check the value of a string and return true of it matches the expected value. This is a quick way to convert a string field into a Boolean and is used to manipulate data as it is being processed. You also have the option to specify whether the match should be case-sensitive or not.

Here's an example:

In the above example, we are going to check the field dc:availability. If the string value of that field equals available (Expected Value) the dc:availability field will be updated to a true(boolean). As we elected not to use case sensitivity, the string AvailAble would also return true. Otherwise, a boolean false value will be stored.

Tesseract Text Extraction

Prerequisite

This task requires Tesseract to be installed on the system that Simflofy is running on.

This task uses Tesseract OCR to scan for text from images and PDF files, saving that text to a field in the repository documented called simflofy_ai_texts. Supported formats are .png, .jpg, .pdf, .tiff, .gif, and .bmp. PDFs are saved on a per-page basis to simflofy_ai_texts.

Configuration

Tessdata Directory: The path to your tessdata folder. This folder should have the trained data of the language you plan to OCR.
Tesseract Library: The path to your Tesseract library folder containing the proper library files for your OS.
Engine Mode: Select which engine Tesseract should use, legacy or LTSM. Ensure that ensure is installed before selecting it, or leave it on the default config for it to detect your engine.
Page Segmentation Mode: By default Tesseract expects a page of text. You can change the way it segments a page if your images differ from this.
Tesseract Language Code: The language code for the installed trained data in your Tessdata directory. This is in ISO 639-1/T format and is the letters before the .traineddata extension for the trained data file.
Use HOCR: Whether to use HOCR. When enabled, text will be output in HTML format rather than as raw text.

Text to PDF

This task takes no arguments, and simply converts text binaries from a Repository Document into a PDF file on output. If the mimetype of the document is not 'text/plain', it will be skipped.

Thumbnail

Generates a thumbnail for a repository document and adds it as a rendition.

Supported File Types

This task currently only supports jpg, png, bmp, wbmp, and gif

Resize Width: The width of the thumbnail image in pixels
Action: The integration action to take following thumbnail generation
Types of Thumb Nails: List of comma delimited file extensions to support

Tika Text Extractor

Apache Tika is an open-source tool used to extract text from documents. Simflofy most commonly uses it to extract text during indexing for federated search.

Tika Content Field: The field where the task will put the extracted content.

Max Content Length (B): Content length, checked before processing. Only extract from documents with length under this value. Set to 0 to process documents of any length.

Content Length to Retrieve (B): The amount of document content to extract. -1 for the entire document.

Fail on Extract: If the extraction fails, fail the document.

Remove Binary: Remove the binary from the documents. Will happen even if document exceeds maximum length.

Stage binary on filesystem: For lower memory systems, stages to the filesystem instead of in memory.

File Extensions to Extract: Comma delimited list of extensions.Extensions are checked at the same time as content length.

Two Way Sync

The Simflofy Two Way Sync Job Task is to filter out any unnecessary documents when doing an incremental sync between two systems. Documents can be filtered by matching MD5 Hash or If the document was just seen in the last run by the other sync Job.

Configuration:

For Two Way Sync you will need to set up two Incremental Migration jobs and chain one Job after the other. You will want to put the Job that should win on a collision as the first Job in the chain.

info

On each Job add the TWO Way Sync Job Task and, if using the MD5 Hash option, you will need to add the MD5 Hash Job Task before the Two Way Sync Job Task.

Check Last Run: If this document was just moved by the other Job in the Sync to this Repository, then let's skip it. If you want to sync it back (maybe after a rule/formula set to false).

Compare MD5 Checksums: Check MD5 Hash of this Document and the Last Document moved by the synced Job. If set to true and MD5 Hashes match, then skip this Document.

Job ID of Other Job: The Job ID of the other chained Job in this Two Way Sync

Watson Image Analysis

Uses IBM Watson Image Analysis to analyze an image, adding its response to a specified field.

The IAM API: Key for IBM Cloud. See instructions here on how to get it.
Collection ID: If left blank, Simflofy will create one
Collection Name: Name of the collection to create. If left blank, will be new-training-collection with the date appended.
Training Mode: Images will be used in conjunction with the training json to train your analyzer. In this mode, no analysis will be performed.
The training json: Required for training mode. The format is {"dog":[25,35,105,215]}. Array order is top, left, width, height with object type as the key.
Field that contains the image urls. If left blank the file content will be used.

ACL Converter​

Alfresco Property Mapping Nodes​

Alfresco Job Run History Nodes​

Apply Security Marks​

AWS Rekognition Image​

AWS Textract​

Basic JDBC Job​

caution

Buffer Binaries to File System​

Clean Array Value To String​

CMIS ACL modification​

Convert To UTC​

Create Preview File​

Date Based Folder Path​

Example

Duplication Check​

EML Metadata Extraction​

Extract Metadata From Path​

Example

Field Lookup​

File Name Extraction​

Example - Alphanumerics

Example - Clearing Unwanted Spaces

Filename Cleanse​

Example - Alphanumerics

Example - Clearing Unwanted Spaces

FileNet ACL Modification​

Deprecation

Filesystem ACL Extraction​

Filter Expression​

Folder Path Cleanse​

FTP​

Generic ACL Mapper​

Limited Usage

Google Vision Image Labels​

Google Vision Text Extraction​

Hash Value Generator​

HTML to PDF​

HTTP​

IBM Watson Natural Language Classification​

Index User Groups​

JavaScript Processing​

info

Jython Script Processing​

Metadata Replace Job​

Mongo Document Metadata Join​

Output Metadata As JSON​

Override File Name​

Override Folder Path​

Pause​

PII Detection​

PII Flag

Default File Location

Property XML Parser Job​

Random Date Field Populator​

Random Field Populator​

Remove Mappings​

Remove Renditions Matching Binary MimeType​

Rename File On Duplicate Filepath​

S3 Attach Binary​

Separate Binaries​

Skip Blank File Name​

Skip On Empty Field​

String to Boolean​

Tesseract Text Extraction​

Prerequisite

Text to PDF​

Thumbnail​

Supported File Types

Tika Text Extractor​

Two Way Sync​

info

Watson Image Analysis​

ACL Converter

Alfresco Property Mapping Nodes

Alfresco Job Run History Nodes

Apply Security Marks

AWS Rekognition Image

AWS Textract

Basic JDBC Job

Buffer Binaries to File System

Clean Array Value To String

CMIS ACL modification

Convert To UTC

Create Preview File

Date Based Folder Path

Duplication Check

EML Metadata Extraction

Extract Metadata From Path

Field Lookup

File Name Extraction

Filename Cleanse

FileNet ACL Modification

Filesystem ACL Extraction

Filter Expression

Folder Path Cleanse

FTP

Generic ACL Mapper

Google Vision Image Labels

Google Vision Text Extraction

Hash Value Generator

HTML to PDF

HTTP

IBM Watson Natural Language Classification

Index User Groups

JavaScript Processing

Jython Script Processing

Metadata Replace Job

Mongo Document Metadata Join

Output Metadata As JSON

Override File Name

Override Folder Path

Pause

PII Detection

Property XML Parser Job

Random Date Field Populator

Random Field Populator

Remove Mappings

Remove Renditions Matching Binary MimeType

Rename File On Duplicate Filepath

S3 Attach Binary

Separate Binaries

Skip Blank File Name

Skip On Empty Field

String to Boolean

Tesseract Text Extraction

Text to PDF

Thumbnail

Tika Text Extractor

Two Way Sync

Watson Image Analysis