
Apache Solr

End of Life

While Simflofy still supports Solr for some basic federation cases, the connector will not receive any further enhancements beyond its current state and does not support Records Management.

Solr is an open-source enterprise-search platform, written in Java, from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features and rich document handling.

The following are the configuration fields for the Simflofy Solr Integration and Search Connections.

Solr Core Creation

Simflofy cannot create a Solr core or generate a Solr schema, so the configured Solr core must already exist. For the simplest case, go to your solr/bin directory and execute the following command:

solr create -c _core-name_

Any attempt to authenticate without a valid core will fail.
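Because authentication fails when the core is missing, it can help to confirm the core exists before configuring the connection. The following is a minimal sketch, not part of Simflofy, that builds a request for Solr's CoreAdmin STATUS action and interprets the JSON reply. The function names are hypothetical, and the check assumes Solr reports an unknown core as an empty object under "status".

```python
import json
from urllib.parse import urlencode


def core_status_url(base_url: str, core_name: str) -> str:
    """Build a CoreAdmin STATUS request URL for a single core.

    base_url is assumed to include the Solr context path,
    for example "http://localhost:8983/solr".
    """
    query = urlencode({"action": "STATUS", "core": core_name, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


def core_exists(status_body: str, core_name: str) -> bool:
    """Return True if the STATUS response body describes the core.

    Assumption: an unknown core appears as an empty object under
    "status", so only a non-empty entry counts as "exists".
    """
    status = json.loads(status_body).get("status", {})
    return bool(status.get(core_name))
```

Fetch the URL with any HTTP client and pass the response body to core_exists.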


Authentication Connection

Authentication connectors are used to authenticate repository/output connections that need certain authentication fields like access tokens or refresh tokens. Click here for more information on setting up an Authentication Connection.

Authentication Connection Fields

  • Name: The name of the connection
  • Host: The location of the Solr server. When using SolrCloud, multiple comma-delimited URLs are supported.
  • Solr Core Name: The name of the Solr core (collection) to authenticate to.
  • Username: (Optional) The username, if security is enabled
  • Password: (Optional) The password, if security is enabled
  • Use SolrCloud: Enable when connecting to SolrCloud, a high-availability setup for clustered Solr servers.
  • Use ZooKeeper: Enable when the cluster is coordinated through ZooKeeper, another Apache product used for clustering servers.
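As an illustration of the comma-delimited Host field, the sketch below splits the value into individual URLs and derives a ping URL for the configured core on each host. It is an assumption-laden example, not Simflofy's code: the function names are hypothetical, each host is assumed to include the Solr context path (for example http://solr1:8983/solr), and the core is assumed to expose Solr's standard /admin/ping handler.

```python
from typing import List


def parse_hosts(host_field: str) -> List[str]:
    """Split a comma-delimited Host value into clean base URLs."""
    return [h.strip().rstrip("/") for h in host_field.split(",") if h.strip()]


def ping_urls(host_field: str, core_name: str) -> List[str]:
    """Build one ping URL per host for the given core (hypothetical helper)."""
    return [f"{host}/{core_name}/admin/ping" for host in parse_hosts(host_field)]
```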

Discovery Connector

Note: There is no Discovery Schema available for Apache Solr.


Integration Connection

The Solr Integration Connection is designed to index content into a Solr core.

Integration Connection Fields

  • Connection Name: This is a unique name given to the connector instance upon creation.
  • Description: A description of the connector to help identify it better.

Job Configuration

A Simflofy Job is the process of moving or syncing content (including versions, ACLs, and metadata) from one CMS (content management system) to another. Click here for details on how to set up an integration job.

Solr Output Specification Fields

  • Solr Core Name: The name of the Solr core to write to.
  • ID Attribute: The attribute to be used for the id field.
  • Autocommit: When enabled, Simflofy will not send a commit command and will let the server decide when to commit new documents. Committing can affect server performance.
  • Enable Wait Flush: Part of a manual commit call. From Solr's documentation: "Block until index changes are flushed to disk."
  • Enable Wait Searcher: Part of a manual commit call. From Solr's documentation: "Block until a new searcher is opened and registered as the main query searcher, making the changes visible."
  • Enable Soft Commit: Part of a manual commit call. From Solr's documentation: "Makes index changes visible while neither fsync-ing index files nor writing a new index descriptor."
  • Commit frequency: If not auto-committing, the number of documents to send before making a commit call. The default value (-1) is treated as 100.
  • Solr Cloud Queue Size: If using SolrCloud, the size of the queue.
  • Term Vector Field: Required for More Like This searches.
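The interaction between Commit frequency and the manual commit options can be pictured with the sketch below. It is an illustration under stated assumptions, not Simflofy's implementation: the function names are hypothetical, any non-positive frequency is treated like the documented -1 default, and the parameter names commit, softCommit, and waitSearcher are the ones Solr's update handler accepts.

```python
from typing import Dict, Iterable, Iterator, List

DEFAULT_COMMIT_FREQUENCY = 100  # the documented fallback for -1


def effective_commit_frequency(configured: int) -> int:
    """Resolve the configured value; non-positive values use the default."""
    return configured if configured > 0 else DEFAULT_COMMIT_FREQUENCY


def commit_params(soft_commit: bool, wait_searcher: bool) -> Dict[str, str]:
    """Query parameters for an explicit commit on Solr's update handler."""
    return {
        "commit": "true",
        "softCommit": str(soft_commit).lower(),
        "waitSearcher": str(wait_searcher).lower(),
    }


def batches(docs: Iterable[dict], frequency: int) -> Iterator[List[dict]]:
    """Group documents so that one commit call follows each group."""
    size = effective_commit_frequency(frequency)
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial group still needs a commit
        yield batch
```

With the default frequency, 250 documents would be sent as groups of 100, 100, and 50, each followed by one commit call.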

Repository Specifications

Note: Solr cannot be set up as a repository source.


Mappings

Simflofy uses Solr's "schemaless" mode, which updates a core's schema automatically based on the first occurrence of a field. Suffixes on the field name tell Solr which field type to use. Use the table below as a guide.

If you do not wish to append suffixes to your field names, you will need to update the core's schema.xml file inside the Solr application.

| Suffix (single) | Suffix (multi) | Type | Description |
|---|---|---|---|
| _t | _txt | text_general | Indexed for full-text search so individual words or phrases may be matched. |
| _s | _ss | string | A string value is indexed as a single unit. This is good for sorting, faceting, and analytics. It's not good for full-text search. |
| _i | _is | int | A 32-bit signed integer |
| _l | _ls | long | A 64-bit signed long |
| _f | _fs | float | IEEE 32-bit floating point number (single precision) |
| _d | _ds | double | IEEE 64-bit floating point number (double precision) |
| _b | _bs | boolean | true or false |
| _dt | _dts | date | A date in Solr's date format |
| _p | NA | location | A latitude and longitude pair for geo-spatial search |
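The suffix rules in the table can be expressed as a small lookup. The sketch below is only an illustration of the naming convention; the function name solr_field_name and the SUFFIXES dictionary are hypothetical and not part of Simflofy or Solr.

```python
from typing import Dict, Optional, Tuple

# Solr type -> (single-valued suffix, multi-valued suffix), from the table above.
SUFFIXES: Dict[str, Tuple[str, Optional[str]]] = {
    "text_general": ("_t", "_txt"),
    "string": ("_s", "_ss"),
    "int": ("_i", "_is"),
    "long": ("_l", "_ls"),
    "float": ("_f", "_fs"),
    "double": ("_d", "_ds"),
    "boolean": ("_b", "_bs"),
    "date": ("_dt", "_dts"),
    "location": ("_p", None),  # no multi-valued suffix is listed
}


def solr_field_name(name: str, solr_type: str, multi_valued: bool = False) -> str:
    """Append the suffix that tells Solr which field type to map."""
    single, multi = SUFFIXES[solr_type]
    suffix = multi if multi_valued else single
    if suffix is None:
        raise ValueError(f"no multi-valued suffix for type {solr_type!r}")
    return name + suffix
```

For example, a single-valued text field named title would be mapped as title_t, and a multi-valued string field named tags as tags_ss.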

Click here for more information on Mappings


Content Search Connector

Search connectors are also called View Connectors. You can manage them in the Content Service menu by clicking on Content View Connections. When creating a content view, select the Solr Content View Connector. To see how to configure the basic options, see the Content View (Search) Connector page.

Search Configuration Fields

  • Collection: Name of the Solr core (collection) you are connecting to.
  • Result Link: Edit, External Link, Download, or Inline.
  • Facet Fields: List of fields that will be used for faceted search.
  • Search for default metadata fields:
  • Field List: List of fields to be part of search
  • Facet Limit: 0 means let Solr decide. Otherwise, set to 1 or more. Default is typically 20, but check your Solr server to be sure.
  • Facet Minimum Count: The minimum count a facet value must have to be included in the results.
  • Facet Date Field: Date field used for Facets
  • Facet Date Start: Start Date for Facets
  • Facet Date End: End Date for Facets
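  • Facet Date Gap: The size of each date range, written as a Solr date math expression such as +1DAY. See Solr's Simple Facet Parameters documentation.
  • Highlight: Yes or No to turn highlighting on or off.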
  • Highlight Fields: Which fields will we return with highlights?
  • Highlight Field Length: The length of the highlighted field result. Default is 300, but many users like to set this to 500 or more.
  • Stats: Gather usage stats for this connector?
  • Public Search: Is this a public search page or behind an authentication wall?
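To show how these fields could translate into a Solr request, the sketch below assembles query parameters from a few of them. It is a hedged illustration, not the connector's actual behavior: build_search_params is a hypothetical name, mapping Highlight Field Length to hl.fragsize is an assumption, and a Facet Limit of 0 is modeled by omitting facet.limit so that Solr applies its own default. The parameter names themselves (fl, facet, facet.field, facet.mincount, facet.limit, hl, hl.fl, hl.fragsize) are standard Solr query parameters.

```python
from typing import List, Sequence, Tuple
from urllib.parse import urlencode


def build_search_params(
    text: str,
    field_list: Sequence[str],
    facet_fields: Sequence[str] = (),
    facet_limit: int = 0,
    facet_min_count: int = 1,
    highlight: bool = True,
    highlight_fields: Sequence[str] = (),
    highlight_length: int = 300,
) -> List[Tuple[str, str]]:
    """Return Solr query parameters as (name, value) pairs."""
    params = [("q", text), ("fl", ",".join(field_list))]
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", field) for field in facet_fields)
        params.append(("facet.mincount", str(facet_min_count)))
        if facet_limit > 0:  # 0 means "let Solr decide", so send nothing
            params.append(("facet.limit", str(facet_limit)))
    if highlight:
        params.append(("hl", "true"))
        params.append(("hl.fl", ",".join(highlight_fields)))
        params.append(("hl.fragsize", str(highlight_length)))
    return params


# Example: encode the pairs into a query string for the core's select handler.
query_string = urlencode(build_search_params("invoice", ["id", "title_t"]))
```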

External Links

If you choose External Link for the Result Link, you'll need to configure the URL by clicking the "Add External Link" button at the bottom of the page.

Content Service Connector: The unique name of the content service connector this is associated with. You typically have an External Link per content service that is part of the index.

Link Field: The field that will be appended to the URL

Link URL: The URL that will be used. The Link Field will be appended to this field for each result.
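The way the Link Field is appended to the Link URL can be sketched as follows. This is an illustration only: build_external_link is a hypothetical helper, the example URL is made up, and percent-encoding the field value is an assumption that the source does not state.

```python
from urllib.parse import quote


def build_external_link(link_url: str, result: dict, link_field: str) -> str:
    """Append the value of link_field from a search result to link_url."""
    value = str(result[link_field])
    # Assumption: encode the value so spaces and slashes are URL-safe.
    return f"{link_url}{quote(value, safe='')}"
```

With a Link URL of https://cms.example.com/view?id= and a Link Field of id, a result whose id is "abc 123" would link to https://cms.example.com/view?id=abc%20123.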

