Box to Elasticsearch Integration Tutorial

Overview

In this tutorial, we will walk through how to set up a job in Simflofy that indexes the content of your Box repository into your Elasticsearch system, so that you can view and manage your documents in our TSearch platform.

Step 1. Set up your Box Repository Authentication Connection.

This will allow Simflofy to connect to your Box Repository.

To set up your Box Authentication Connection:

  1. Select Authentication from the Connections section of the navigation menu.
  2. Click the Create New Auth Connection button.
  3. In the New Auth Connection form
    1. Name your connection. In this example we will use cs-Box-Box JWT Auth Connector
    2. Select the connection type, which for this example will be Box JWT Auth Connector
  4. Next, edit your new Box Authentication Connection with the following input:
    1. In the JWT JSON field, enter the JSON text you received from your Box Developer Application.
    2. Change the number of app users to 5
    3. Leave the rest of the fields as is and click the Save button at the top of the page to save your changes and return to the Authentication Connections Page
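
Before pasting the JWT JSON into the field, it can help to confirm that the text parses as JSON and contains the keys Box normally places in the configuration file downloaded from the Developer Console. The following is a minimal sketch and not part of Simflofy. The function name `missing_jwt_fields` and the sample values are hypothetical, and the key list assumes the standard Box JWT application config layout.

```python
import json

# Keys expected in the config file that the Box Developer Console generates
# for a JWT application (assumed standard layout; adjust if yours differs).
APP_SETTINGS_KEYS = ("clientID", "clientSecret", "appAuth")
APP_AUTH_KEYS = ("publicKeyID", "privateKey", "passphrase")


def missing_jwt_fields(jwt_json_text):
    """Return dotted names of expected keys that are absent from the JSON text.

    Raises json.JSONDecodeError if the text is not valid JSON, and
    ValueError if the top-level value is not a JSON object.
    """
    config = json.loads(jwt_json_text)
    if not isinstance(config, dict):
        raise ValueError("JWT JSON must be a JSON object")

    missing = []
    if "enterpriseID" not in config:
        missing.append("enterpriseID")

    app_settings = config.get("boxAppSettings")
    if not isinstance(app_settings, dict):
        missing.append("boxAppSettings")
        return missing

    for key in APP_SETTINGS_KEYS:
        if key not in app_settings:
            missing.append("boxAppSettings." + key)

    app_auth = app_settings.get("appAuth")
    if isinstance(app_auth, dict):
        for key in APP_AUTH_KEYS:
            if key not in app_auth:
                missing.append("boxAppSettings.appAuth." + key)
    return missing


# Placeholder values only; a real file holds your application's credentials.
sample = json.dumps({
    "boxAppSettings": {
        "clientID": "abc",
        "clientSecret": "xyz",
        "appAuth": {"publicKeyID": "k1", "privateKey": "PEM", "passphrase": "p"},
    },
    "enterpriseID": "12345",
})
print(missing_jwt_fields(sample))  # an empty list means every expected key is present
```

An empty list suggests the text is structurally complete. It does not prove that the credentials themselves are valid.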

Step 2. Set up your Elasticsearch Authentication Connection.

This will allow Simflofy to connect to your Elasticsearch repository.

To set up your Elasticsearch Authentication Connection:

  1. On the Authentication Connections page, click the Create New Auth Connection button.
  2. In the New Auth Connection form
    1. Name your connection. In this example we will use rm-Elastic Search Authentication Connector
    2. Select the connection type. In this case we will use Elasticsearch Authentication Connector
  3. On the next page, edit your new Elasticsearch connection:
    1. In the server URL field, enter the URL for your Elasticsearch server. For example: http://127.0.0.1:9200/
    2. Leave the other fields as is and click on the Save button to save these changes.
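
If you want to confirm that the server URL is reachable before relying on it in a job, you can query the Elasticsearch root endpoint, which returns basic cluster information as JSON. This is a rough sketch under stated assumptions: the helper names are hypothetical, the URL is the example from this step, and the server is assumed to accept unauthenticated HTTP requests (clusters with security enabled need credentials that this sketch does not send).

```python
import json
import urllib.request


def normalize_server_url(server_url):
    """Return the server URL with surrounding spaces removed and one trailing slash."""
    return server_url.strip().rstrip("/") + "/"


def fetch_cluster_info(server_url, timeout=5):
    """GET the Elasticsearch root endpoint and return its JSON body as a dict.

    Assumes the server accepts unauthenticated HTTP requests.
    """
    url = normalize_server_url(server_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


print(normalize_server_url("http://127.0.0.1:9200"))  # http://127.0.0.1:9200/
```

Calling `fetch_cluster_info("http://127.0.0.1:9200/")` against a running server should return a dictionary that includes the cluster name and version. A `urllib.error.URLError` usually means the server is not reachable at that address.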

Step 3. Set up your Repository Integration Connection.

Using this connection, Simflofy will generate a query, or use one you provide, to retrieve unique IDs for documents.

To set up your Box Integration Connection:

  1. Select Integration from the Connections section of the navigation menu.
  2. Click the Create Integration Connection button.
  3. In the New Integration Form
    1. Name your integration connection. In this example we will use the name rm-box
    2. For the connection type choose Box Connector from the drop-down list. You can also begin typing Box in the search field to filter the drop-down list and then select it.
    3. Click Save to begin editing this new integration connection
  4. In the Edit Connection: RM-BOX page, enter the following:
    1. Give your connection a description. Example: Box Connector
    2. Choose the Authentication Connection you created in Step 1: cs-Box-Box JWT Auth Connector
    3. Leave the rest of the fields as is and click Save to lock in your changes and return to the Integration Connections page

Step 4. Set up your Output Integration Connection

In Output mode, connectors push content and metadata. Many of them can also build version series from the source systems.

To set up your Elasticsearch Integration Connection:

  1. At the bottom of the Integration Connections page, click the Create Integration Connection button.
  2. In the New Integration Form
    1. Name your integration connection. Here we will use the name rm-ElasticSearch
    2. Choose the Elasticsearch Authentication Connection we created in Step 2: rm-Elastic Search Authentication Connector
    3. Leave the rest of the fields as is and click Save to finish creating this connection and return to the Integration Connections page.

Step 5. Create your Content Service Connection.

Simflofy Content Service connections offer public REST endpoints that allow for integration with external applications.

To create a content service connection for your Box Repository:

  1. Select Content Service from the Connections section of the navigation menu
  2. At the bottom of the Content Service Connection page, click the Create New Content Service Connection button
  3. In the new Content Service Connection page, enter the following information:
    1. Connector ID: Name your content service connection. For this example we will use the name box
    2. Description: Give your connection a description. Here we will use Box Content Service
    3. Type: The type for this connection will be Box Content Service Connector
    4. Security Mode: Choose Authentication Connection as the security mode, then select the Box Authentication Connection we created in Step 1 from the dropdown: cs-Box-Box JWT Auth Connector
    5. Leave all other fields as is and click Save to finish the creation of this connection and return to the Content Service Connection Page

Step 6. Create a Job Mapping for your Content Service Connection

Content mappings will allow you to map custom parameters to properties in the destination system.

To create a Job Mapping for Elasticsearch:

  1. Under the Integrations section of the navigation menu, select Job Mappings
  2. At the bottom of the Job Mappings page, click the Create New Job Mapping button
  3. Name your Job Mapping. Here we will use Basic ElasticSearch Mapping
  4. Enter the following mapping:
    1. Source: content
    2. Target: content
    3. Type: String
  5. Click Add New Mapping
    1. Select Field Mapping from the drop-down
  6. Click Save to save this mapping for use in your Job Configuration.
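
Conceptually, the mapping entered above says "take the source field content, convert it to a string, and write it to the target field content". The sketch below models that idea in plain Python. It is an illustration only and not Simflofy's implementation: the function `apply_field_mappings`, the dictionary layout, and the sample document are hypothetical, and the sketch copies only mapped fields, which may differ from how Simflofy treats unmapped fields.

```python
def apply_field_mappings(source_doc, mappings):
    """Copy mapped fields from source_doc into a new target document."""
    target_doc = {}
    for mapping in mappings:
        source_field = mapping["source"]
        if source_field not in source_doc:
            continue  # nothing to copy for this mapping
        value = source_doc[source_field]
        if mapping["type"] == "String":
            value = str(value)  # coerce to text, as the String type suggests
        target_doc[mapping["target"]] = value
    return target_doc


# The single mapping entered in this step.
basic_mapping = [{"source": "content", "target": "content", "type": "String"}]

# Hypothetical source document; only the mapped field is carried over here.
doc = {"content": "Quarterly report text", "internalFlag": True}
print(apply_field_mappings(doc, basic_mapping))
```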

Step 7. Set up your integration job.

This will integrate your Box repository with your Elasticsearch repository, allowing you to view your Box files in TSearch through your Elasticsearch connection.

To set up your integration job:

  1. Select List Jobs from the Integration section of the navigation menu
  2. At the bottom of the Jobs page, click the Create New Job button
  3. In the Create New Job form:
    1. Give your new job a name. For this example we will use RM-Box to RM-ElasticSearch
    2. For Repository Connection, select the Box Integration Connection we created in Step 3 from the drop-down: rm-box
    3. For Output Connection, select the Elasticsearch Integration Connection we created in Step 4 from the drop-down: rm-ElasticSearch
    4. Leave the other fields as is and click Save to continue setting up this new integration job.
  4. In the Details tab:
    1. Under Content Service Connector, add the Box Content Service Connection you created in Step 5: box
  5. In the Tasks tab, add the default Tika Extractor Task:
    1. Select Tika Extractor Task from the dropdown
    2. Click the plus button to edit the task properties
      1. Uncheck the Fail Document on Extract Error box
      2. Uncheck the Remove Binary After Extraction box
      3. Leave all other fields as is and select Done
  6. In the Mappings tab, under the Select Additional Mappings section:
    1. Select the Job Mapping we created in Step 6: Basic ElasticSearch Mapping
  7. The RM-Box tab is where you add any configuration needed for the Box Integration Connection you are using as the repository.
    1. Box Query Tab:
      1. Enter your Box folder ID. Example: 129069155071
      2. Leave all other fields as is
    2. Repository Crawl Tab:
      1. Make sure Retrieve folders is checked
      2. Leave all other fields as is
  8. The RM-Elasticsearch tab is where you enter any additional settings needed to use Elasticsearch as your output integration connection. Under the Server tab:
    1. Under Index Name, add the index where the documents will be stored. In this example we will use the index rm
    2. Under Batch size, change the count to 20
    3. Leave all other fields as is
  9. Click Save to save your job configurations and return to the Jobs page
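
To give a feel for what an index name of rm and a batch size of 20 mean on the Elasticsearch side, the sketch below groups documents into request bodies for the Elasticsearch `_bulk` API, 20 documents per request. Whether Simflofy uses the bulk API internally is an assumption made here for illustration. The function `build_bulk_bodies` and the sample documents are hypothetical.

```python
import json


def build_bulk_bodies(documents, index_name="rm", batch_size=20):
    """Yield one NDJSON _bulk request body per batch of documents.

    documents: iterable of (doc_id, source_dict) pairs.
    """
    lines = []
    count = 0
    for doc_id, source in documents:
        # Each document takes two lines: an action line, then the document itself.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
        count += 1
        if count == batch_size:
            yield "\n".join(lines) + "\n"  # a bulk body must end with a newline
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"  # final, possibly smaller, batch


docs = [(str(i), {"content": "text %d" % i}) for i in range(45)]
bodies = list(build_bulk_bodies(docs))
print([body.count("\n") // 2 for body in bodies])  # documents per request: [20, 20, 5]
```

A smaller batch size means more requests with less data in each, which can be easier on memory when extracted document text is large.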

Step 8. Run and Monitor the job

This will integrate the chosen content from your Box repository into your Elasticsearch repository, allowing you to view the content in the TSearch Platform.

To run and monitor this job:

  1. Select Run and Monitor Jobs from the Integration section of the navigation menu
  2. Find the job created in Step 7: RM-Box to RM-ElasticSearch
  3. Select the triangle next to the job to run this integration job.

Depending on how many files you have stored, this could take more than a few minutes. To monitor the progress of this integration, you can click the Refresh button at the top of the page. You can also set Simflofy to automatically refresh every 30 seconds, every minute, or every 5 minutes.

Once the integration is complete, the status will change to green and state Complete.
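
Once the job reports Complete, you may want to check from outside Simflofy that documents actually arrived in the index. The sketch below uses the Elasticsearch `_count` and `_search` APIs against the example index rm and the mapped content field. The helper names are hypothetical, the server URL is the example from Step 2, and unauthenticated HTTP access is assumed.

```python
import json
import urllib.request


def index_url(server_url, index_name, endpoint):
    """Build a URL such as http://127.0.0.1:9200/rm/_count."""
    return "%s/%s/%s" % (server_url.rstrip("/"), index_name, endpoint)


def content_match_query(text):
    """Build a full-text match query against the mapped content field."""
    return {"query": {"match": {"content": text}}}


def count_documents(server_url, index_name="rm", timeout=5):
    """Return the number of documents in the index via the _count API."""
    url = index_url(server_url, index_name, "_count")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)["count"]


def search_content(server_url, text, index_name="rm", timeout=5):
    """POST a match query to the _search API and return the list of hits."""
    request = urllib.request.Request(
        index_url(server_url, index_name, "_search"),
        data=json.dumps(content_match_query(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["hits"]["hits"]


print(index_url("http://127.0.0.1:9200/", "rm", "_count"))  # http://127.0.0.1:9200/rm/_count
```

With a running server, `count_documents("http://127.0.0.1:9200/")` should return a number greater than zero after a successful run, and `search_content` can be used to spot-check that extracted text is searchable.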

Congratulations! You have successfully integrated your Box Cloud Content into Elasticsearch!