Start Up Fusion Solr as Non-root User in CentOS 6.6

One of the first things to do when installing and configuring Lucidworks Fusion 2.x on a single server, using the Solr instance included in the Fusion distribution, is to get all Fusion services to start on their own after a server reboot.

Lucidworks covers the fundamentals of how to do this in the Fusion 2.1 User Guide, but the instructions assume you have Ubuntu Upstart scripts at your fingertips, which are not available out of the box in CentOS 6.6 or RHEL 6.5 distros.

Fusion needs to be started as a non-root user, much like other web services such as Tomcat. It has really simple start, stop, restart and status commands, but they are not based on *.sh files.

Here are the steps that worked for me.

Create init script

sudo nano /etc/init.d/fusion

Add commands to script

#!/bin/bash
# description: Fusion Startup
# processname: fusion
# chkconfig: 234 20 80
# by max.derungs@providence.org

FUSION_CMD=/opt/fusion/bin/fusion

# Source the function library for daemon.
. /etc/init.d/functions

# Summon the daemons.
case "$1" in
start)
    daemon --check fusion $FUSION_CMD start
;;
stop)
    daemon --check fusion $FUSION_CMD stop
;;
status)
    daemon --check fusion $FUSION_CMD status
;;
restart)
    daemon --check fusion $FUSION_CMD restart
;;
*)
    # Unknown or missing argument: print usage and signal an error.
    echo $"Usage: $0 {start|stop|restart|status}"
    exit 2
;;
esac
exit 0
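
The script above calls daemon without a --user option, so when init runs it as root, the Fusion command is launched as root as well. A minimal sketch of a non-root variant follows, assuming a dedicated account named "fusion" exists and owns /opt/fusion. The account name and the fusion_service function name are illustrative, not something from the Fusion guide.

```shell
#!/bin/bash
# Sketch of a non-root dispatch for the init script.
# Assumption: an account named "fusion" exists and owns /opt/fusion.

FUSION_CMD=/opt/fusion/bin/fusion
FUSION_USER=fusion   # hypothetical account name, adjust to your system

# Load daemon() on CentOS/RHEL 6; skipped where the library is absent.
[ -r /etc/init.d/functions ] && . /etc/init.d/functions

# Run one Fusion action as FUSION_USER through the daemon() helper.
fusion_service() {
    case "$1" in
        start|stop|status|restart)
            daemon --user="$FUSION_USER" --check fusion "$FUSION_CMD" "$1"
            ;;
        *)
            echo "Usage: fusion_service {start|stop|restart|status}" >&2
            return 2
            ;;
    esac
}
```

In an init script, the last line would then be something like fusion_service "$1", with the same chkconfig header comments as above.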

Set permissions and make the script executable

sudo chmod 755 /etc/init.d/fusion

Set up the chkconfig utility to start the service at boot time

sudo /sbin/chkconfig --add fusion
sudo /sbin/chkconfig --level 234 fusion on
sudo /sbin/chkconfig --list fusion
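
The --list output shows one on/off flag per runlevel. As a quick sanity check, here is a small sketch that boils that line down to the enabled runlevels. The helper name enabled_runlevels is made up for illustration.

```shell
#!/bin/bash
# Print the runlevels at which a service is enabled, given one line of
# `chkconfig --list <service>` output on stdin,
# e.g. "fusion  0:off 1:off 2:on 3:on 4:on 5:off 6:off".
enabled_runlevels() {
    # Split the line into one token per line, then keep the
    # runlevel number of every "N:on" token.
    tr -s ' \t' '\n' | awk -F: '$2 == "on" { printf "%s", $1 } END { print "" }'
}
```

Usage: /sbin/chkconfig --list fusion | enabled_runlevels. With the settings above it should print the same levels passed to --level. Note that runlevel 5 is not among them, so a server that boots into a graphical login would need it added.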

Test

sudo service fusion start
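
Since the whole point is running Fusion as a non-root user, it is worth checking who actually owns the processes after the service starts. A rough sketch, assuming Fusion lives in /opt/fusion as in the init script. The function name and the FUSION_HOME override are illustrative.

```shell
#!/bin/bash
# Read `ps -eo user,args` output on stdin and print the distinct owners
# of every process whose command line mentions the Fusion directory.
# The directory is passed through the environment rather than as an awk
# argument, so the awk process does not match itself in the ps listing.
fusion_process_owners() {
    FUSION_DIR="${FUSION_HOME:-/opt/fusion}" \
        awk 'index($0, ENVIRON["FUSION_DIR"]) { print $1 }' | sort -u
}
```

Usage: ps -eo user,args | fusion_process_owners. If "root" shows up in the output, some Fusion process was started as root.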

Lucidworks Fusion 2.1 datasource connector to SQL returns only one document

Issue

I’ve been working with the Lucidworks Fusion 2.1 search engine platform lately. Really cool stuff; worthy of respect. One of the first steps in using it is to crawl content from a data source. It can crawl just about anything.

Trying to crawl an MS SQL table recently threw an odd error that took a bit longer than usual to figure out. The settings were straightforward enough, and the job looked like it was finding content, but it returned only one document (that’s “row” in db-speak).

Analysis

It wasn’t anything to do with the JDBC driver, permissions or the SQL select statement, which had a WHERE clause. It was an issue with one of the fields. Testing the query one field at a time showed that all string data types returned all records, but one number field consistently caused the job to fail.

The solr.log explained the issue:

2015-12-09T11:50:40,786 - ERROR [qtp355629945-19:SolrException@139] - {collection=dg, core=dg_shard1_replica1, node_name=10.246.71.13:8983_solr, replica=core_node1, shard=shard1} - org.apache.solr.common.SolrException: ERROR: [doc=a596f774-7289-4d03-802b-e1ca4945c9bc] Error adding field 'NPINumber'='1.174598122E9' msg=For input string: "1.174598122E9"
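
When a log is long, it helps to pull out just the field names Solr complained about. A small sketch that scans for the "Error adding field" message shown above. The function name is made up, and the location of solr.log depends on your install.

```shell
#!/bin/bash
# Print each distinct field name that Solr reported in an
# "Error adding field" message, reading log lines from stdin.
failing_fields() {
    grep -o 'Error adding field [^=]*' |
        sed -e 's/^Error adding field //' \
            -e 's/^[^[:alnum:]_]*//' \
            -e 's/[^[:alnum:]_]*$//' |
        sort -u
}
```

Usage: failing_fields < /path/to/solr.log (the path is a placeholder). For the log line above it prints NPINumber.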

It was a float field in the data source, so Fusion passed the value along in float form (1.174598122E9), and Solr could not add it to the NPINumber field. Why was the number a float? Who knows.

Solution

Changing it to an “int” fixed the problem. That’s really what the field was anyhow: an integer, or arguably a string in this case.
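
To see that the rejected value is just a whole number written in float notation, it can be expanded on the command line. A tiny sketch; sci_to_int is a hypothetical helper name, and it only makes sense for values that are whole numbers.

```shell
#!/bin/bash
# Expand a number written in scientific notation into plain digits.
# awk parses the value as a floating-point number and prints it
# with no decimal places.
sci_to_int() {
    awk -v v="$1" 'BEGIN { printf "%.0f\n", v }'
}
```

Usage: sci_to_int 1.174598122E9 prints 1174598122, the value the field was meant to hold.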

Fusion datasource job status
