Lucidworks Fusion 2.1 datasource connector to SQL returns only one document

Issue

I've been working with the Lucidworks Fusion 2.1 search platform lately. Really cool stuff; worthy of respect. One of the first steps in using it is to crawl content from a data source, and it can crawl just about anything.

Crawling an MS SQL table recently threw an odd error that took longer than usual to figure out. The settings were straightforward enough, and the job looked like it was finding content, but it would return only one document (that's a "row" in db-speak).

Analysis

It had nothing to do with the JDBC driver, permissions, or the SQL SELECT statement (which had a WHERE clause). It was an issue with one of the fields. Testing the query one field at a time showed that every string field returned all records, but one numeric field consistently caused the job to fail.

The solr.log explained the issue:

2015-12-09T11:50:40,786 - ERROR [qtp355629945-19:SolrException@139] - {collection=dg, core=dg_shard1_replica1, node_name=10.246.71.13:8983_solr, replica=core_node1, shard=shard1} - org.apache.solr.common.SolrException: ERROR: [doc=a596f774-7289-4d03-802b-e1ca4945c9bc] Error adding field 'NPINumber'='1.174598122E9' msg=For input string: "1.174598122E9"

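The column was a float in the data source, so Fusion passed the value along in scientific notation (1.174598122E9), and Solr rejected that string when adding it to the NPINumber field. Why was the number stored as a float? Who knows.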
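The "For input string" wording is what Java reports when a string fails to parse as a whole number. Here is a minimal sketch that reproduces the same message outside of Fusion and Solr. The class and method names are made up for illustration, and I am assuming the target field expects a whole number; the log does not say which Solr field type was in play.

```java
public class FloatFieldDemo {
    // Render a double using Java's default string conversion,
    // which switches to scientific notation for large magnitudes.
    static String render(double value) {
        return Double.toString(value);
    }

    // Try to parse the text as a whole number.
    // Returns the exception message on failure, or null on success.
    static String parseError(String text) {
        try {
            Long.parseLong(text);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the float column value from the log.
        String asText = render(1174598122.0);
        System.out.println(asText);
        System.out.println(parseError(asText));
        System.out.println(parseError("1174598122"));
    }
}
```

The float rendering of the value is not a valid whole-number string, while the plain digits parse without complaint.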

Solution

Changing it to an "int" fixed the problem. That's really what the field was anyhow: an integer, or arguably a string in this case.
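To see why a whole-number type helps, here is a rough sketch that turns the scientific-notation text from the log back into plain digits. This is not what Fusion does internally and not the fix I applied (that was the type change); the class and method names are hypothetical and it only illustrates the difference between the two representations.

```java
import java.math.BigDecimal;

public class WholeNumberDemo {
    // Convert numeric text, including scientific notation, to plain digits.
    // toBigIntegerExact throws ArithmeticException if the value has a
    // nonzero fractional part, so nothing is silently truncated.
    static String toWholeNumberText(String text) {
        return new BigDecimal(text).toBigIntegerExact().toString();
    }

    public static void main(String[] args) {
        String digits = toWholeNumberText("1.174598122E9");
        System.out.println(digits);
        // The plain-digit form parses as a whole number without error.
        System.out.println(Long.parseLong(digits));
    }
}
```

Once the value arrives as plain digits, there is nothing left for the number parser to trip over.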

Fusion datasource job status
