Apache Solr
End of Life
While Simflofy still supports Solr for some basic federation cases, it will not receive any further enhancements beyond its current state, and it does not support Records Management.
Solr is an open-source enterprise-search platform, written in Java, from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features and rich document handling.
The following sections describe the configuration fields for the Simflofy Solr Integration and Search Connections.
Solr Core Creation
Simflofy cannot create a Solr core or generate a Solr schema, so the configured Solr core must already exist. For the simplest case, go to your solr/bin directory and run the following command:
solr create -c _core-name_
Any attempt to authenticate without a valid core will fail.
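Because authentication fails when the core is missing, it can help to confirm the core exists before configuring the connection. The following is a minimal sketch, not part of Simflofy: it builds a request URL for Solr's CoreAdmin STATUS action and inspects the JSON body. The base URL, the core name `my-core`, and the sample response bodies are assumptions for illustration, and the check assumes Solr reports an unknown core as an empty object.

```python
import json
from urllib.parse import urlencode


def core_status_url(base_url: str, core: str) -> str:
    """Build a CoreAdmin STATUS request URL for a single core."""
    query = urlencode({"action": "STATUS", "core": core, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


def core_exists(status_body: str, core: str) -> bool:
    """Return True if the STATUS response describes the named core.

    Assumption: an unknown core appears as an empty object under "status".
    """
    status = json.loads(status_body).get("status", {})
    return bool(status.get(core))


# Hypothetical response bodies, trimmed to the fields this check reads.
present = '{"status": {"my-core": {"name": "my-core"}}}'
missing = '{"status": {"my-core": {}}}'

print(core_status_url("http://localhost:8983/solr", "my-core"))
print(core_exists(present, "my-core"), core_exists(missing, "my-core"))
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the body to `core_exists` would complete the check against a running server.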
Authentication Connection
Authentication connectors are used to authenticate repository/output connections that need certain authentication fields like access tokens or refresh tokens. Click here for more information on setting up an Authentication Connection.
Authentication Connection Fields
- Name: The name of the connection
- Host: The location of the Solr server. If using SolrCloud, multiple URLs can be supplied as a comma-delimited list.
- Solr Core Name: The name of the Solr core (collection) to authenticate to.
- Username: (Optional) The username, if security is enabled
- Password: (Optional) The password, if security is enabled
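- Use SolrCloud: Enable if the Solr servers run as a SolrCloud cluster, Solr's high-availability setup for clustered servers.
- Use ZooKeeper: Enable if the cluster is coordinated through Apache ZooKeeper, another Apache product used for clustering servers.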
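The comma-delimited Host value described above can be pictured as a simple list of URLs. This sketch is an illustration only and does not reflect how Simflofy parses the field internally; the function name and the sample hosts are hypothetical.

```python
def parse_hosts(host_field: str) -> list[str]:
    """Split a comma-delimited Host value into individual URLs.

    Surrounding whitespace is dropped and empty entries are ignored.
    """
    return [part.strip() for part in host_field.split(",") if part.strip()]


# Hypothetical SolrCloud node addresses.
hosts = parse_hosts("http://solr1:8983/solr, http://solr2:8983/solr")
print(hosts)
```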
Discovery Connector
Note: There is no Discovery Schema available for Apache Solr.
Integration Connection
The Solr Integration Connection is designed to index content into a Solr core.
Integration Connection Fields
- Connection Name: This is a unique name given to the connector instance upon creation.
- Description: A description of the connector to help identify it better.
Job Configuration
A Simflofy Job is the process of moving or syncing content (including versions, ACLs, and metadata) from one CMS (content management system) to another. Click here for details on how to set up an integration job.
Solr Output Specification Fields
- Solr Core Name: The name of the Solr core to write to.
- ID Attribute: The attribute to be used for the id field.
- Autocommit: When enabled, Simflofy does not send a commit command and lets the server decide when to commit new documents. Committing can affect server performance.
- Enable Wait Flush: Part of a manual commit call. From Solr's documentation: Block until index changes are flushed to disk.
- Enable Wait Searcher: Part of a manual commit call. From Solr's documentation: Block until a new searcher is opened and registered as the main query searcher, making the changes visible.
- Enable Soft Commit: Part of a manual commit call. From Solr's documentation: Makes index changes visible while neither fsync-ing index files nor writing a new index descriptor.
- Commit frequency: If not auto-committing, the number of documents to send before making a commit call. The default value (-1) is treated as 100.
- Solr Cloud Queue Size: If using SolrCloud, the size of the queue.
- Term Vector Field: Required for More Like This searches.
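The interaction between Autocommit and Commit frequency can be sketched as follows. This is an illustration under stated assumptions, not Simflofy's implementation: the function names are hypothetical, the final commit for leftover documents is assumed, and any non-positive frequency (not only -1) is treated as the default of 100. `commit`, `waitSearcher`, and `softCommit` are parameters of Solr's update handler; how Simflofy passes them is not documented here.

```python
from typing import Iterable, Iterator

DEFAULT_COMMIT_FREQUENCY = 100  # the field description says -1 becomes 100


def effective_frequency(configured: int) -> int:
    """Resolve the configured commit frequency to a usable batch size."""
    # Assumption: every non-positive value falls back to the default.
    return configured if configured > 0 else DEFAULT_COMMIT_FREQUENCY


def commit_params(wait_searcher: bool, soft_commit: bool) -> dict[str, str]:
    """Build update-handler parameters for a manual commit call."""
    return {
        "commit": "true",
        "waitSearcher": str(wait_searcher).lower(),
        "softCommit": str(soft_commit).lower(),
    }


def plan_commits(doc_ids: Iterable[str], autocommit: bool, frequency: int) -> Iterator[str]:
    """Yield 'send <id>' and 'commit' events in the order they would occur."""
    every = effective_frequency(frequency)
    pending = 0
    for doc_id in doc_ids:
        yield f"send {doc_id}"
        pending += 1
        if not autocommit and pending >= every:
            yield "commit"
            pending = 0
    # Assumption: leftover documents are committed at the end of the job.
    if not autocommit and pending:
        yield "commit"


print(list(plan_commits(["a", "b", "c"], autocommit=False, frequency=2)))
print(commit_params(wait_searcher=True, soft_commit=False))
```

With Autocommit enabled, `plan_commits` yields only send events, which mirrors the description of letting the server decide when to commit.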
Repository Specifications
Note: Solr cannot be set up as a repository source.
Mappings
Simflofy uses Solr's "schemaless" mode, which updates a core's schema automatically based on the first occurrence of a field. Suffixes on the field names tell Solr which field type to map. Use the table below as a guide.
If you do not want to append suffixes to your field names, you will need to update the core's schema.xml file inside the Solr application.
Suffix (single) | Suffix (multi) | Type | Description |
---|---|---|---|
_t | _txt | text_general | Indexed for full-text search so individual words or phrases may be matched. |
_s | _ss | string | A string value is indexed as a single unit. This is good for sorting, faceting, and analytics. It's not good for full-text search. |
_i | _is | int | A 32-bit signed integer. |
_l | _ls | long | A 64-bit signed long. |
_f | _fs | float | IEEE 32-bit floating point number (single precision). |
_d | _ds | double | IEEE 64-bit floating point number (double precision). |
_b | _bs | boolean | true or false. |
_dt | _dts | date | A date in Solr's date format. |
_p | NA | location | A latitude and longitude pair for geo-spatial search. |
Click here for more information on Mappings
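The suffix table above can be expressed as a small lookup. This is a sketch for planning field names in a mapping, not Simflofy code; the function name `solr_field_name` and the sample field names are hypothetical.

```python
from typing import Optional

# (single-valued suffix, multi-valued suffix) per Solr field type,
# copied from the suffix table.
SUFFIXES: dict[str, tuple[str, Optional[str]]] = {
    "text_general": ("_t", "_txt"),
    "string": ("_s", "_ss"),
    "int": ("_i", "_is"),
    "long": ("_l", "_ls"),
    "float": ("_f", "_fs"),
    "double": ("_d", "_ds"),
    "boolean": ("_b", "_bs"),
    "date": ("_dt", "_dts"),
    "location": ("_p", None),  # no multi-valued suffix is listed
}


def solr_field_name(name: str, solr_type: str, multi_valued: bool = False) -> str:
    """Append the suffix that maps a field to the given Solr type."""
    single, multi = SUFFIXES[solr_type]
    suffix = multi if multi_valued else single
    if suffix is None:
        raise ValueError(f"no multi-valued suffix for type {solr_type!r}")
    return name + suffix


print(solr_field_name("title", "text_general"))
print(solr_field_name("tags", "string", multi_valued=True))
print(solr_field_name("modified", "date"))
```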
Content Search Connector
Search connectors are also called View Connectors. You can manage them in the Content Service menu by clicking on Content View Connections. When creating a content view, select the Solr Content View Connector. To see how to configure the basic options, see the Content View (Search) Connector page.
Search Configuration Fields
- Collection: Name of the Solr core (collection) you are connecting to.
- Result Link: Edit, External Link, Download, or Inline.
- Facet Fields: List of fields that will be used for faceted search.
- Search for default metadata fields:
- Field List: List of fields to be part of search
- Facet Limit: 0 means let Solr decide. Otherwise, set to 1 or more. Default is typically 20, but check your Solr server to be sure.
- Facet Minimum Count: The minimum count a facet value needs in order to be returned.
- Facet Date Field: Date field used for Facets
- Facet Date Start: Start Date for Facets
- Facet Date End: End Date for Facets
- Facet Date Gap: The gap between date facet ranges. See Simple Facet Parameters in the Solr documentation.
- Highlight: Yes or No to turn highlighting on or off.
- Highlight Fields: The fields to return with highlights.
- Highlight Field Length: The length of the highlighted field result. Default is 300, but many users like to set this to 500 or more.
- Stats: Gather usage stats for this connector?
- Public Search: Is this a public search page or behind an authentication wall?
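To see how these fields relate to a Solr request, the sketch below translates a few of them into standard Solr query parameters (`fl`, `facet`, `facet.field`, `facet.limit`, `facet.mincount`, `hl`, `hl.fl`, `hl.fragsize`). The mapping from Simflofy fields to these parameters is an assumption made for illustration, the function name and sample field names are hypothetical, and date facets are left out.

```python
from urllib.parse import urlencode


def build_search_params(
    query: str,
    field_list: list[str],
    facet_fields: list[str],
    facet_limit: int = 0,
    facet_min_count: int = 1,
    highlight_fields: tuple[str, ...] = (),
    highlight_length: int = 300,
) -> list[tuple[str, str]]:
    """Translate search configuration values into Solr query parameters."""
    params = [("q", query)]
    if field_list:
        params.append(("fl", ",".join(field_list)))
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in facet_fields)
        if facet_limit > 0:
            # A limit of 0 is omitted so that Solr applies its own default.
            params.append(("facet.limit", str(facet_limit)))
        params.append(("facet.mincount", str(facet_min_count)))
    if highlight_fields:
        params.append(("hl", "true"))
        params.append(("hl.fl", ",".join(highlight_fields)))
        params.append(("hl.fragsize", str(highlight_length)))
    return params


params = build_search_params(
    "report",
    field_list=["id", "title_t"],
    facet_fields=["type_s"],
    highlight_fields=("content_t",),
    highlight_length=500,
)
print(urlencode(params))
```

Returning a list of pairs rather than a dictionary keeps repeated parameters such as `facet.field` intact when more than one facet field is configured.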
External Links
If you choose External Link for the Result Link, you'll need to configure the URL by clicking the "Add External Link" button at the bottom of the page.
- Content Service Connector: The unique name of the content service connector this is associated with. You typically have an External Link per content service that is part of the index.
- Link Field: The field that will be appended to the URL.
- Link URL: The URL that will be used. The Link Field will be appended to this field for each result.
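The way Link URL and Link Field combine can be sketched as a plain concatenation. This is an illustration only: the function name, the sample URL, and the field name `doc_id_s` are hypothetical, and no URL encoding is applied because the text describes a simple append.

```python
def external_link(link_url: str, result: dict[str, str], link_field: str) -> str:
    """Append the result's Link Field value to the configured Link URL."""
    return f"{link_url}{result[link_field]}"


# Hypothetical search result and link configuration.
result = {"doc_id_s": "12345", "title_t": "Quarterly report"}
print(external_link("https://cms.example.com/view?id=", result, "doc_id_s"))
```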