Box to Elasticsearch Integration Tutorial
Overview
In this tutorial, we walk through setting up a Simflofy job that indexes the content of your Box repository into Elasticsearch, so that you can view and manage your documents in our TSearch platform.
Step 1. Setting up your Box Repository Authentication Connection.
This will allow Simflofy to connect to your Box Repository.
To set up your Box Authentication Connection:
Select Authentication from the Connections section of the navigation menu.
Click the Create New Auth Connection button.
In the New Auth Connection form, enter the following configurations:
Name: cs-Box-Box JWT Auth Connector
Connection Type: Box JWT Auth Connector
Next, edit your new Box Authentication Connection configurations:
JWT JSON: JSON text you received from your Box Developer Application
Number of App Users: 5
Leave the rest of the fields as is.
Click Save.
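Before pasting the JWT JSON into the form, it can help to confirm that the text is complete. The sketch below is not part of Simflofy. It assumes the JSON follows the layout of the configuration file that the Box Developer Console generates for a JWT application (a `boxAppSettings` object plus an `enterpriseID`), and the function name `missing_box_jwt_keys` is made up for this example.

```python
import json

# Keys expected in the JSON config file that the Box Developer Console
# generates for a JWT application (an assumption about your file's layout).
REQUIRED_APP_SETTINGS = ("clientID", "clientSecret", "appAuth")
REQUIRED_APP_AUTH = ("publicKeyID", "privateKey", "passphrase")


def missing_box_jwt_keys(jwt_json_text):
    """Return dotted paths of expected keys that are missing from the JSON text."""
    config = json.loads(jwt_json_text)
    missing = []
    if "enterpriseID" not in config:
        missing.append("enterpriseID")
    app_settings = config.get("boxAppSettings")
    if not isinstance(app_settings, dict):
        missing.append("boxAppSettings")
        return missing
    for key in REQUIRED_APP_SETTINGS:
        if key not in app_settings:
            missing.append("boxAppSettings." + key)
    app_auth = app_settings.get("appAuth")
    if isinstance(app_auth, dict):
        for key in REQUIRED_APP_AUTH:
            if key not in app_auth:
                missing.append("boxAppSettings.appAuth." + key)
    return missing
```

An empty list means every expected key is present. It does not prove that the credentials themselves are valid.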
Step 2. Set up your Elasticsearch Output Connection.
This will allow Simflofy to connect to your Elasticsearch repository.
To set up your Elasticsearch Output Connection:
Click the Create New Auth Connection button.
In the New Auth Connection form, enter the following configurations.
Name: rm-Elastic Search Authentication Connector
Connection Type: Elasticsearch Authentication Connector
On the next page, edit your new Elasticsearch connection configurations:
Server URL: http://127.0.0.1:9200/
Leave the other fields as is.
Click Save.
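If the connection later fails, a quick check that Elasticsearch answers at the Server URL can rule out a network or address problem. The following is a minimal sketch using only the Python standard library and the Elasticsearch `_cluster/health` endpoint. It assumes the server accepts unauthenticated HTTP requests, which may not hold if security is enabled on your cluster; the function names are illustrative.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def cluster_health_url(server_url):
    """Build the cluster health URL from the Server URL entered in the form."""
    base = server_url if server_url.endswith("/") else server_url + "/"
    return urljoin(base, "_cluster/health")


def fetch_cluster_status(server_url, timeout=5):
    """Return the cluster status string reported by Elasticsearch.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    with urlopen(cluster_health_url(server_url), timeout=timeout) as response:
        return json.load(response)["status"]
```

Calling `fetch_cluster_status("http://127.0.0.1:9200/")` should return `green`, `yellow`, or `red` when the server is reachable.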
Step 3. Set up your Repository Integration Connection.
Using this connection Simflofy will generate a query, or use one provided, to retrieve unique ids for documents.
To set up your Box Integration Connection:
Select Integration from the Connections section of the navigation menu.
Click the Create Integration Connection button.
In the New Integration Form, enter the following configurations:
Name: rm-box
Connection: Box Connector
Click Save.
In the Edit Connection: RM-BOX page, enter the following configurations:
Description: Box Connector
Authentication Connection: cs-Box-Box JWT Auth Connector
Leave the rest of the fields as is.
Click Save.
Step 4. Set up your Output Integration Connection
In Output mode, connectors push content and metadata. Many of them can also build version series from the source systems.
To set up your Elasticsearch Integration Connection:
At the bottom of the Integration Connections page click the Create Integration Connection button.
In the New Integration Form, enter the following configurations:
Name: rm-ElasticSearch
Authentication Connection: rm-Elastic Search Authentication Connector
Leave the rest of the fields as is.
Click Save.
Step 5. Create your Content Service Connection.
Simflofy Content Service connections offer public REST endpoints that allow for integration with external applications.
To create a content service connection for your Box Repository:
- Select Content Service from the Connections section of the navigation menu
- At the bottom of the Content Service Connection page click the Create New Content Service Connection button
- In the new Content Service Connection page enter the following information
- Connector ID: Name your content service connection. For this example we will use the name box
- Description: Give your connection a description. Here we will use Box Content Service
- Type: The type for this connection will be Box Content Service Connector
- Security Mode: Choose Authentication Connection as the security mode, then select the Box Authentication Connection we created in Step 1 (cs-Box-Box JWT Auth Connector) from the dropdown
- Leave all other fields as is and click Save to finish the creation of this connection and return to the Content Service Connection Page
Step 6. Create a Job Mapping for your Content Service Connection
Content mappings will allow you to map custom parameters to properties in the destination system.
To create a Job Mapping for ElasticSearch:
- Under the Integrations section of the navigation menu select Job Mappings
- At the bottom of the Job Mappings page click the Create New Job Mapping button
- Name your Job Mapping. Here we will use Basic ElasticSearch Mapping
- Enter the following mapping:
- Source: content
- Target: content
- Type: String
- Click Add New Mapping
- Select Field Mapping from the drop-down
- Click Save to save this mapping for use in your Job Configuration.
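Conceptually, a field mapping like the one above copies a source property to a target property and casts it to the chosen type. The sketch below illustrates that idea only. It is not Simflofy's implementation, and `apply_field_mapping`, the dictionary layout, and the choice to ignore unmapped fields are all assumptions made for this example.

```python
def apply_field_mapping(source_doc, mappings):
    """Copy each mapped source field to its target name, cast to the mapped type.

    Fields without a mapping are ignored in this sketch; how Simflofy treats
    them is not covered here.
    """
    casts = {"String": str}  # only the type used in this tutorial
    target_doc = {}
    for mapping in mappings:
        if mapping["source"] in source_doc:
            cast = casts[mapping["type"]]
            target_doc[mapping["target"]] = cast(source_doc[mapping["source"]])
    return target_doc


# The single mapping entered in this step, written as data (hypothetical layout).
BASIC_ELASTICSEARCH_MAPPING = [
    {"source": "content", "target": "content", "type": "String"},
]
```

With this mapping, a source document's `content` value ends up in the `content` field of the output document as a string.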
Step 7. Set up your integration job.
This will integrate your Box repository with your Elasticsearch repository, allowing you to view your Box files in TSearch through your Elasticsearch connection.
To set up your integration job:
- Select List Jobs from the Integration section of the navigation menu
- At the bottom of the Jobs page, click the Create New Job button
- In the Create New Job form:
- Give your new job a name. For this example we will use RM-Box to RM-ElasticSearch
- For Repository Connection, select the Box Integration connection we created in Step 3 (rm-box) from the drop-down
- For Output Connection, select the Elasticsearch connection we created in Step 4 (rm-ElasticSearch) from the drop-down
- Leave the other fields as is and click Save to continue setting up this new integration job
- In the Details Tab
- Under Content Service Connector, add the Box Content Service Connection you created in Step 5 (box)
- In the Tasks Tab we will be adding the Default Tika Extractor Task.
- Select Tika Extractor Task from the dropdown
- Click the plus button to edit the task properties
- Uncheck the Fail Document on Extract Error box
- Uncheck the Remove Binary After Extraction box
- Leave all other fields as is and select Done
- In the Mappings Tab under the Select Additional Mappings section
- Select the Job Mapping we created in Step 6 (Basic ElasticSearch Mapping)
- The RM-Box tab is where you will add any necessary configurations for the Box Integration connection you are using as the repository.
- Box Query Tab:
- Enter your Box folder ID. Example: 129069155071
- Leave all other fields as is
- Repository Crawl Tab:
- Make sure Retrieve folders is checked
- Leave all other fields as is
- The RM-Elasticsearch tab is where you enter any additional settings needed to use Elasticsearch as your output integration connection. Under the Server tab:
- Under Index Name, add the index where the documents will be stored. In this example we will use the index rm
- Under Batch Size, change the count to 20
- Leave all other fields as is
- Click Save to save your job configurations and return to the Jobs page
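The Box folder ID requested in the Box Query tab is the number at the end of the folder's address in the Box web app, for example `https://app.box.com/folder/129069155071`. The helper below is an illustrative sketch for pulling the ID out of such a URL; the function name is made up for this example.

```python
import re

# Matches the numeric ID that follows "/folder/" in a Box web app URL.
FOLDER_URL_PATTERN = re.compile(r"/folder/(\d+)")


def box_folder_id(folder_url):
    """Extract the numeric folder ID from a Box web app folder URL."""
    match = FOLDER_URL_PATTERN.search(folder_url)
    if match is None:
        raise ValueError("No Box folder ID found in: " + folder_url)
    return match.group(1)
```

The ID is returned as a string so it can be pasted into the form unchanged.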
Step 8. Run and Monitor the job
This will integrate the chosen content from your Box repository to your Elasticsearch repository allowing you to view the content in the TSearch Platform.
To run and monitor this job:
- Select Run and Monitor Jobs from the Integration section of the navigation menu
- Find the job created in Step 7 (RM-Box to RM-ElasticSearch)
- Select the triangle next to the job to run this Integration job.
Depending on how many files you have stored this could take more than a few minutes. To monitor the progress of this integration you can click the Refresh button at the top of the page. You can also set Simflofy to automatically refresh every 30 seconds, every minute, or every 5 minutes.
Once the integration is complete the status will change to green and state Complete.
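To confirm outside of Simflofy that documents reached the index, you can query Elasticsearch directly. The sketch below uses the standard `_count` and `_search` endpoints through the Python standard library. It assumes the Server URL and index name used in this tutorial (`http://127.0.0.1:9200/` and `rm`), that the extracted text is stored in the `content` field as mapped in Step 6, and that the server accepts unauthenticated requests. The function names are illustrative.

```python
import json
from urllib.request import Request, urlopen


def count_url(server_url, index):
    """Build the URL of the _count endpoint for an index."""
    return server_url.rstrip("/") + "/" + index + "/_count"


def search_request(server_url, index, text, size=5):
    """Build a _search request with a match query on the content field."""
    body = {"query": {"match": {"content": text}}, "size": size}
    return Request(
        server_url.rstrip("/") + "/" + index + "/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def indexed_document_count(server_url, index):
    """Return the number of documents currently stored in the index."""
    with urlopen(count_url(server_url, index), timeout=10) as response:
        return json.load(response)["count"]
```

After the job completes, `indexed_document_count("http://127.0.0.1:9200/", "rm")` should be greater than zero, and passing the result of `search_request(...)` to `urlopen` returns matching documents as JSON.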