Indexing XML data


Let’s see how we can index XML data in Solr.


We already walked through the indexing process while indexing a CSV file in the Indexing CSV data article.

The same 2 steps apply here:

1) Define the fields of the new XML data in the schema
2) Publish the data to Solr

Let’s use the money.xml file that ships with Solr as the XML data to index.

money.xml is available inside solr-6.2.0\example\exampledocs

Let’s add the field definitions below to the schema.xml file (solr-6.2.0\server\solr\MyCore\conf), after the <uniqueKey>id</uniqueKey> element

<!-- Fields added for indexing money.xml  file-->
<field name="name" type="text_general" indexed="true" stored="true"/>
<field name="manu" type="text_general" indexed="true" stored="true"/>
<field name="manu_id_s" type="text_general" indexed="true" stored="true"/>
<field name="cat" type="text_general" indexed="true" stored="true"/>
<field name="features" type="text_general" indexed="true" stored="true"/>
<field name="price_c" type="tdouble" indexed="true" stored="true"/>
<field name="inStock" type="boolean" indexed="true" stored="true"/>



If we look at the fields in the money.xml file, we can see that it uses 8 fields.

However, we have defined only 7 fields in the schema.xml file.

The remaining field, id, needs no new definition: the schema.xml file already declares it as the unique key through the uniqueKey element
<uniqueKey>id</uniqueKey>
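To check which fields a file needs before editing the schema, the field names can be listed programmatically. The following is a minimal Python sketch, assuming the file uses Solr's XML update format (`<add>`, `<doc>`, `<field name="...">`). The sample document and its values are illustrative placeholders, not the actual contents of money.xml, and the helper names `field_names` and `missing_fields` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative document in Solr's XML update format.
# The values are placeholders, not the real contents of money.xml.
SAMPLE_XML = """
<add>
  <doc>
    <field name="id">SAMPLE-1</field>
    <field name="name">Sample currency</field>
    <field name="manu">Sample bank</field>
    <field name="manu_id_s">sample</field>
    <field name="cat">currency</field>
    <field name="features">Coins and notes</field>
    <field name="price_c">1</field>
    <field name="inStock">true</field>
  </doc>
</add>
"""

# The 7 fields added to schema.xml, plus the id field that the
# schema already declares as its uniqueKey.
DECLARED = {"id", "name", "manu", "manu_id_s", "cat",
            "features", "price_c", "inStock"}


def field_names(xml_text):
    """Return the distinct field names used by all documents, sorted."""
    root = ET.fromstring(xml_text)
    return sorted({field.get("name") for field in root.iter("field")})


def missing_fields(xml_text, declared):
    """Return field names present in the XML but absent from the declared set."""
    return [name for name in field_names(xml_text) if name not in declared]


if __name__ == "__main__":
    print(field_names(SAMPLE_XML))               # every field the file uses
    print(missing_fields(SAMPLE_XML, DECLARED))  # fields still to be defined
```

An empty list from `missing_fields` suggests that every field in the file has a matching definition. The check only compares names; it does not validate field types.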

Now let’s post the data to Solr to index it.

Navigate to the path below in the command prompt
solr-6.2.0\example\exampledocs

Run the command below

java -Dtype=text/xml -Durl=http://localhost:8983/solr/MyCore/update -jar post.jar money.xml

Since this is a java command, we can pass system properties to it using -D

We are passing 2 properties here

-Dtype: Specifies the type of the file, such as CSV, XML or JSON. We pass text/xml because the data we are publishing is XML.

-Durl: URL of the core under which indexing has to happen
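The same update request can also be sent without post.jar. The sketch below, in Python, builds an HTTP POST to the update URL used in the command, with the `text/xml` content type. The `commit=true` parameter and the helper names `build_update_request` and `post_file` are assumptions made for this example, and it presumes Solr is running locally with the MyCore core.

```python
import urllib.request

UPDATE_URL = "http://localhost:8983/solr/MyCore/update"


def build_update_request(xml_bytes, url=UPDATE_URL):
    """Build (but do not send) a POST request carrying Solr XML update data."""
    return urllib.request.Request(
        url + "?commit=true",  # assumption: commit right after the update
        data=xml_bytes,        # a request with a body is sent as POST
        headers={"Content-Type": "text/xml"},
    )


def post_file(path, url=UPDATE_URL):
    """Send the file to Solr and return the HTTP status code."""
    with open(path, "rb") as handle:
        request = build_update_request(handle.read(), url)
    with urllib.request.urlopen(request) as response:
        return response.status
```

With Solr running, calling `post_file("money.xml")` from the exampledocs directory sends the file; a 200 status indicates that Solr accepted the request.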

The output in the command prompt shows that the Solr server has indexed the file and committed the indexed data in MyCore.

Now access the URL below and check the statistics of the indexed data
http://localhost:8983/solr/#/MyCore

Figure: indexed result in the Solr Admin console

We can observe that Num Docs displays the number of records that are indexed.

Since money.xml contains 4 records and all of them are indexed, Num Docs displays 4.
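The document count can also be read outside the Admin console. The following is a minimal sketch, assuming the core's `/select` handler is available: it matches all documents with `rows=0` and reads `numFound` from the JSON response. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

SELECT_URL = "http://localhost:8983/solr/MyCore/select"


def count_url(base=SELECT_URL):
    """URL that matches every document but returns no rows, only the count."""
    params = {"q": "*:*", "rows": 0, "wt": "json"}
    return base + "?" + urllib.parse.urlencode(params)


def num_found(response_text):
    """Extract response.numFound from a Solr JSON query response."""
    return json.loads(response_text)["response"]["numFound"]


def count_documents(base=SELECT_URL):
    """Query a running Solr instance and return the number of matching documents."""
    with urllib.request.urlopen(count_url(base)) as response:
        return num_found(response.read().decode("utf-8"))
```

With Solr running, `count_documents()` returns the number of documents that match `*:*`, which should agree with the Num Docs value in the Admin console.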

Access indexed data
We can retrieve all the indexed data directly in the Solr Admin console, without specifying any condition.

Access the URL below
http://localhost:8983/solr/#/MyCore

Select MyCore and click on the Query option

Now click on Execute Query

Figure: Solr money.xml query result

We can see that the result contains all 4 rows of indexed data.

We can also search this indexed data with different query parameters and conditions.
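As one illustration of such conditions, the sketch below builds query URLs for the `/select` handler using the standard `q`, `fq` and `fl` parameters. The search term `Dollar` and the helper name `search_url` are hypothetical; substitute values that exist in your indexed data.

```python
import urllib.parse

SELECT_URL = "http://localhost:8983/solr/MyCore/select"


def search_url(query, filter_query=None, fields=None, base=SELECT_URL):
    """Build a Solr select URL from a main query, optional filter and field list."""
    params = [("q", query), ("wt", "json")]
    if filter_query:
        # fq restricts the result set without affecting relevance scoring
        params.append(("fq", filter_query))
    if fields:
        # fl limits which stored fields are returned for each document
        params.append(("fl", ",".join(fields)))
    return base + "?" + urllib.parse.urlencode(params)


if __name__ == "__main__":
    # Hypothetical term: documents whose name matches "Dollar" and that are in stock.
    print(search_url("name:Dollar", filter_query="inStock:true",
                     fields=["id", "name"]))
    # All documents, returning only the id and price_c fields.
    print(search_url("*:*", fields=["id", "price_c"]))
```

The printed URLs can be opened in a browser while Solr is running, or the same values can be entered in the q, fq and fl boxes of the Query screen.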

Please check the Indexing CSV data article for different ways of accessing the indexed data.

About the Author

Karibasappa G C (KB)
Founder of javainsimpleway.com
I love Java and open source technologies and am very passionate about software development.
I like to share my knowledge with others, especially on technology 🙂
I have kept all the examples as simple as possible so that beginners can understand them.
All the code posted on my blog is developed, compiled and tested in my development environment.
If you find any mistakes or bugs, please drop an email to kb.knowledge.sharing@gmail.com
