What is Indexing?

Indexing is the process of arranging data systematically so that a piece of information can be located much faster.


Let’s understand indexing with the examples below.


Example 1

Assume that we have the following table in the database, which stores each person’s details.

Person Table

Solr table data before indexing

If we want to fetch the records whose last name is Dravid, the database scans every row and compares its last name with Dravid. Each row that matches is added to the result set.

This means going through every row, even the rows whose last name is something other than Dravid.

Don’t you think this takes more time, since it has to scan through unwanted rows?

Yes, it certainly takes more time, because every single row is visited.
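The full scan described above can be sketched in Java roughly as follows. This is only an illustration, not how a database is actually implemented; the `Person` class, the `findByLastName` method and the sample rows are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class FullScanDemo {

    // Hypothetical row of the Person table (only two columns for brevity).
    static class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Visits every row, including the rows whose last name does not match.
    static List<Person> findByLastName(List<Person> rows, String lastName) {
        List<Person> result = new ArrayList<>();
        for (Person row : rows) {
            if (row.lastName.equals(lastName)) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data, made up for this sketch.
        List<Person> rows = new ArrayList<>();
        rows.add(new Person("Sachin", "Tendulkar"));
        rows.add(new Person("Rahul", "Dravid"));
        rows.add(new Person("Anil", "Kumble"));

        List<Person> matches = findByLastName(rows, "Dravid");
        System.out.println(matches.size() + " match(es) after scanning " + rows.size() + " rows");
    }
}
```

The number of comparisons grows with the number of rows, no matter how few rows actually match.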

Now observe the data below.

Solr table after indexing

Here we have arranged the table data alphabetically by last name.

Now when we search for the last name Dravid, we can use the alphabetical order to jump to the right row and return the result.

We no longer need to go through all the rows to find Dravid, because we know that LastName is in alphabetical order.

This improves the performance.

The gain grows with the size of the table. With 100,000 rows, a search that repeatedly halves the sorted range needs at most about 17 comparisons, whereas a full scan may need up to 100,000.
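Here is a minimal Java sketch of searching data that is already sorted. It assumes the last names sit in an in-memory list; the `SortedLookupDemo` class, the `findPosition` method and the sample names are made up for illustration, and real databases use more elaborate structures for their indexes. The standard `Collections.binarySearch` method relies on the sort order to narrow the search range at every step instead of visiting each entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedLookupDemo {

    // Returns the position of lastName in the sorted list, or -1 if it is absent.
    static int findPosition(List<String> sortedLastNames, String lastName) {
        int index = Collections.binarySearch(sortedLastNames, lastName);
        return index >= 0 ? index : -1;
    }

    public static void main(String[] args) {
        // Sample data, made up for this sketch.
        List<String> lastNames = new ArrayList<>();
        lastNames.add("Tendulkar");
        lastNames.add("Dravid");
        lastNames.add("Kumble");
        lastNames.add("Ganguly");

        // The "indexing" step: arrange the last names in alphabetical order.
        Collections.sort(lastNames);

        System.out.println("Dravid found at position " + findPosition(lastNames, "Dravid"));
    }
}
```

Note that the search is only correct because the list was sorted first; calling `binarySearch` on unsorted data gives undefined results.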


Example 2

Another example is a textbook that has an index in it.

Assume we want to find the chapter called Brave Man. If the book has no index at the beginning, finding that chapter is difficult and takes more time.

Suppose instead that the book has an index like the one below

ChapterName : page number

Then we can look up the chapter name in the index, get its page number, and open the chapter in the book right away.
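The book index can be pictured in Java as a simple map from chapter name to page number. This is just a sketch: the `BookIndexDemo` class, the second chapter title and the page numbers are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class BookIndexDemo {

    // Builds a hypothetical index: chapter name -> page number.
    static Map<String, Integer> buildIndex() {
        Map<String, Integer> index = new HashMap<>();
        index.put("Brave Man", 42);      // page numbers are invented
        index.put("The Lost City", 87);  // chapter title is invented
        return index;
    }

    public static void main(String[] args) {
        Map<String, Integer> index = buildIndex();

        // One lookup in the index replaces flipping through every page.
        Integer page = index.get("Brave Man");
        System.out.println("Brave Man starts on page " + page);
    }
}
```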

So in both cases, indexing improves the performance of searching.

This is exactly what many websites need, especially ecommerce sites where people search a lot.

This way of representing data more efficiently in order to make searching faster is called indexing.

There are many frameworks available in the market which help to achieve indexing and also provide many more features along with it, such as faceted navigation, hit highlighting, caching etc.

Some of these frameworks are Solr, Sphinx, Elasticsearch, Algolia, Swiftype etc.

Each framework has its own way of indexing the data published to it.

Solr is the most widely used open source search server, with a current run-rate of over 6,000 downloads a day and installations at 4000 companies, as per the Solr wiki statistics at the time of posting.

Check the link below to read the same

https://cwiki.apache.org/confluence/display/solr/Getting+Started

About the Author

Karibasappa G C (KB)
Founder of javainsimpleway.com
I love Java and open source technologies and am very passionate about software development.
I like to share my knowledge with others, especially on technology 🙂
I have kept all the examples as simple as possible so that beginners can understand them.
All the code posted on my blog is developed, compiled and tested in my development environment.
If you find any mistakes or bugs, please drop an email to kb.knowledge.sharing@gmail.com
