Jul. 18, 2018
Karan Jeet Singh
Importing Data from PostgreSQL into Solr is part 2 of a 3-part series on importing data, written by our Solutions Engineer, Karan Singh. Part 1 covered Importing Data into Apache Solr.
Importing data from PostgreSQL into Solr is straightforward: Solr's Data Import Handler (DIH) can do it for us.
PostgreSQL supports JDBC, so the Data Import Handler only needs a JDBC driver, a connection string, a query, and a mapping of the incoming fields.
Here are the steps to set up the Data Import Handler for PostgreSQL:
1. Edit solrconfig.xml. Add the following tags under the <config> tag so that Solr picks up the custom JARs:
<lib dir="${solr.install.dir:../../../..}/contrib/dataimporthandler/lib" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" />
Then define the /dataimport request handler:
<!-- A request handler for data import handler -->
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
2. Create a data-config.xml file in the same folder as solrconfig.xml. This file contains the connection string, the query that fetches the data, and the details of the incoming fields.
<dataConfig>
  <dataSource type="JdbcDataSource" name="imdb-title-rating" driver="org.postgresql.Driver" url="jdbc:postgresql://127.0.0.1:5432/imdb" user="postgrestest" password="test" />
  <document name="title_rating">
    <entity name="rating" query="SELECT * FROM title_ratings;">
      <field column="tconst" name="tconst" />
      <field column="averagerating" name="averageRating" />
      <field column="numvotes" name="numVotes" />
    </entity>
  </document>
</dataConfig>
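As a variation on the configuration above, the sketch below names the data source on the entity explicitly and lists the columns instead of using SELECT *. It also aliases tconst to id, on the assumption that id is the schema's uniqueKey and that tconst is unique per row. Neither assumption is stated in the original configuration, so treat this as one possible layout rather than the required one.

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" name="imdb-title-rating" driver="org.postgresql.Driver"
              url="jdbc:postgresql://127.0.0.1:5432/imdb" user="postgrestest" password="test" />
  <document name="title_rating">
    <!-- The dataSource attribute ties this entity to the named data source above -->
    <entity name="rating" dataSource="imdb-title-rating"
            query="SELECT tconst AS id, tconst, averagerating, numvotes FROM title_ratings">
      <!-- Assumption: id is the uniqueKey and is filled from tconst via the SQL alias -->
      <field column="id" name="id" />
      <field column="tconst" name="tconst" />
      <field column="averagerating" name="averageRating" />
      <field column="numvotes" name="numVotes" />
    </entity>
  </document>
</dataConfig>
```

Listing the columns keeps the import stable if the table later gains columns that the schema does not define.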
3. Edit the managed-schema file or, if it doesn't exist, create a schema.xml file. Solr must recognize the incoming fields defined in Step 2, so every incoming field needs to be mapped to a Solr data type that can parse its data.
If you are editing managed-schema, add the fields defined above:
<field name="tconst" type="string" indexed="true" stored="true" />
<field name="averageRating" type="tint" indexed="true" stored="true" />
<field name="numVotes" type="int" indexed="true" stored="true" />
<field name="id" type="string" indexed="true" stored="true" multiValued="false" />
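The field definitions above rely on field types named string, int, and tint, and on id acting as the document key. A minimal sketch of the supporting schema entries follows, assuming the Trie-based types that older default schemas shipped. Newer Solr releases default to point-based types such as pint, so check which types your schema actually defines before copying these.

```xml
<!-- Assumption: id is the unique key for each document -->
<uniqueKey>id</uniqueKey>

<!-- Assumption: field types as defined in older default schemas -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0" />
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0" />
```

If the schema already contains these entries, only the field definitions need to be added.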
4. Download the PostgreSQL JDBC driver JAR and copy it to the "contrib/dataimporthandler/lib" folder.
SearchStax engineers have helped our clients import data from different sources into Apache Solr. If you’re interested in learning more about our hosted Solr solution and support, please reach out.
Copyrights © SearchStax Inc.2014-2024. All Rights Reserved.