Indexing JSON data
Let’s see how to index JSON data in Solr.
The process is the same one we followed for a CSV file in the Indexing CSV data article.
We need to follow 2 steps:
1) Define the fields of the new JSON data in the schema
2) Publish the data to Solr
Let’s use the books.json file that ships with Solr to index JSON data.
books.json is available inside solr-6.2.0\example\exampledocs
Let’s add the field definitions below to the schema.xml file (solr-6.2.0\server\solr\MyCore\conf), after the <uniqueKey>id</uniqueKey> element
<!-- Fields added for indexing books.json file-->
<field name="cat" type="text_general" indexed="true" stored="true"/>
<field name="name" type="text_general" indexed="true" stored="true"/>
<field name="price" type="tdouble" indexed="true" stored="true"/>
<field name="inStock" type="boolean" indexed="true" stored="true"/>
<field name="author" type="text_general" indexed="true" stored="true"/>
If we look at the books.json file, we can see that it contains 10 fields.
But we have defined only 5 fields in the schema.xml file.
What happens to the other fields? Will they be indexed?
Yes, the other fields will also be indexed. But how?
The id field in the books.json file is covered by the uniqueKey element of the schema.xml file, which names id as the unique key of each document
<uniqueKey>id</uniqueKey>
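The other 4 fields are indexed through the dynamicField elements in schema.xml: a field that is not defined explicitly is still indexed when its name matches a dynamicField pattern (for example, a name ending in _i or _s).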
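To make the split between explicit fields, the unique key and dynamic fields concrete, here is a small Python sketch. It is only an illustration of the idea, not Solr's own matching code: the patterns in DYNAMIC_PATTERNS and the list of field names are assumptions based on the stock schema and the stock books.json, so check them against your own files.

```python
from fnmatch import fnmatchcase

# Fields defined explicitly in schema.xml in this article, plus the unique key.
EXPLICIT_FIELDS = {"id", "cat", "name", "price", "inStock", "author"}

# Assumed dynamicField patterns; verify against the dynamicField
# entries in your own schema.xml.
DYNAMIC_PATTERNS = ["*_i", "*_s", "*_t"]


def resolve_field(field_name):
    """Return 'explicit', the first matching dynamic pattern, or None."""
    if field_name in EXPLICIT_FIELDS:
        return "explicit"
    for pattern in DYNAMIC_PATTERNS:
        if fnmatchcase(field_name, pattern):
            return pattern
    return None


# Field names assumed to be those of one record in the stock books.json.
BOOK_FIELDS = ["id", "cat", "name", "author", "series_t", "sequence_i",
               "genre_s", "inStock", "price", "pages_i"]

for name in BOOK_FIELDS:
    print(name, "->", resolve_field(name))
```

A field for which this sketch returns None would match neither an explicit definition nor one of the assumed patterns, so it is worth checking the schema before posting such data.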
Now let’s post the data to Solr to index it.
Let’s navigate to the path below in the command prompt:
solr-6.2.0\example\exampledocs
Then run the command below:
java -Dtype=text/json -Durl=http://localhost:8983/solr/MyCore/update -jar post.jar books.json
Since it’s a java command, we can pass run-time arguments (system properties) using -D.
We are passing 2 such arguments here:
-Dtype: specifies the type of the file being posted, such as CSV, XML or JSON. We pass text/json because the data we are publishing is JSON.
-Durl: the update URL of the core in which the data should be indexed
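The same post can also be made over plain HTTP, without post.jar. The Python sketch below is a minimal illustration under a few assumptions: the helper names are made up for this example, the content type application/json is used instead of text/json, and commit=true is added to the URL so that the data is committed immediately.

```python
import urllib.request


def build_update_request(update_url, json_bytes, commit=True):
    """Build (but do not send) a POST request carrying JSON documents."""
    url = update_url + ("?commit=true" if commit else "")
    return urllib.request.Request(
        url,
        data=json_bytes,  # a request with a body is sent as POST
        headers={"Content-Type": "application/json"},
    )


def post_file(update_url, path):
    """Read a JSON file, send it to the update URL, return the HTTP status."""
    with open(path, "rb") as f:
        request = build_update_request(update_url, f.read())
    with urllib.request.urlopen(request) as response:
        return response.status


# Example call (needs a running Solr with a core named MyCore):
# post_file("http://localhost:8983/solr/MyCore/update", "books.json")
```

Building the request separately from sending it keeps the URL and header logic easy to inspect before anything reaches the server.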
Solr indexes the file and commits the indexed data to MyCore, and the result of the post is displayed in the command prompt.
Now open the URL below and check the statistics of the indexed data:
http://localhost:8983/solr/#/MyCore
We can observe that Num Docs displays the number of records that are indexed.
Since books.json contains 4 records and all of them are indexed, Num Docs displays 4.
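The document count can also be checked from a script instead of the Admin console. The sketch below is one possible way to do it, assuming the core answers on its /select handler: it runs a match-all query with rows=0 and reads numFound from the JSON response. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def count_url(core_url):
    """URL of a match-all query that returns only the hit count."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return core_url + "/select?" + params


def num_found(response_text):
    """Extract response.numFound from a Solr JSON select response."""
    return json.loads(response_text)["response"]["numFound"]


def count_docs(core_url):
    """Ask a running Solr core how many documents match *:*."""
    with urllib.request.urlopen(count_url(core_url)) as response:
        return num_found(response.read().decode("utf-8"))


# Example call (needs a running Solr with a core named MyCore):
# count_docs("http://localhost:8983/solr/MyCore")
```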
Access indexed data
We can view the indexed data directly in the Solr Admin console, without specifying any search condition.
Open the URL below:
http://localhost:8983/solr/#/MyCore
Select MyCore and click the Query option.
Now click Execute Query.
We can see that the result contains all 4 indexed records.
We can also search this indexed data using different query parameters and conditions.
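As a rough illustration of such parameters, the Python sketch below builds select URLs that can be pasted into a browser. The helper name select_url and the example conditions (inStock:true, sorting by price) are assumptions chosen to match the fields defined earlier; q, fq, fl, sort and rows are standard Solr query parameters.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def select_url(core_url, q="*:*", fq=None, fl=None, sort=None, rows=10):
    """Build a /select URL from the most common query parameters."""
    params = {"q": q, "rows": rows, "wt": "json"}
    if fq:
        params["fq"] = fq      # filter query, e.g. "inStock:true"
    if fl:
        params["fl"] = fl      # fields to return, e.g. "id,name,price"
    if sort:
        params["sort"] = sort  # e.g. "price asc"
    return core_url + "/select?" + urlencode(params)


CORE = "http://localhost:8983/solr/MyCore"

# All records, same as clicking Execute Query with the defaults.
print(select_url(CORE))

# Only books in stock, cheapest first, returning three fields.
print(select_url(CORE, fq="inStock:true", fl="id,name,price", sort="price asc"))
```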
Please see the Indexing CSV data article for different ways of accessing the indexed data.