Please go through the Solr-related articles on the Solr articles page before reading about Solr in Hybris.
Most eCommerce sites provide search functionality, especially for searching product details.
Products are the main searchable data on any eCommerce site.
Since Hybris is used for developing eCommerce sites, Solr is used in Hybris to make product search on the site faster.
The diagram below shows how Solr is used in Hybris.
Whenever a user accesses data in the storefront, it can come from either the Hybris DB or Solr, depending on whether that data is indexed.
If the data is indexed, it is stored separately in Solr and can be accessed from there.
If the data is not indexed, it is still available in the Hybris DB and can be accessed from there.
Communication between Solr and the Hybris DB is one-way: Solr only reads data from the Hybris DB and never writes anything back to it.
Hybris triggers a cron job for indexing; Solr then gets the source data from the Hybris DB, indexes it, and stores the indexed data in Solr itself.
Accessing data from the Hybris DB takes more time than accessing the indexed data in Solr, so Solr is preferred over the Hybris DB for searching.
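The flow described above can be sketched in a few lines of Java. This is only a toy model under stated assumptions: the two maps stand in for the Hybris DB and the Solr index, and the names StorefrontLookupSketch, saveItem, runIndexingJob and sourceFor are hypothetical, not Hybris or Solr APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-ins only: none of these names are Hybris or Solr APIs.
class StorefrontLookupSketch {
    // Stands in for the Hybris DB: it holds every item.
    private final Map<String, String> hybrisDb = new HashMap<>();
    // Stands in for the Solr index: it holds only what has been indexed.
    private final Map<String, String> solrIndex = new HashMap<>();

    void saveItem(String code, String name) {
        hybrisDb.put(code, name);
    }

    // Indexing is one-way: it reads from the DB map and writes only to the index map.
    void runIndexingJob() {
        solrIndex.clear();
        solrIndex.putAll(hybrisDb);
    }

    // Reports where a lookup for the given code would be served from.
    String sourceFor(String code) {
        if (solrIndex.containsKey(code)) {
            return "SOLR";
        }
        return hybrisDb.containsKey(code) ? "DB" : "NONE";
    }

    public static void main(String[] args) {
        StorefrontLookupSketch store = new StorefrontLookupSketch();
        store.saveItem("P1", "Running shoe");
        System.out.println(store.sourceFor("P1")); // not indexed yet, so served by the DB
        store.runIndexingJob();
        System.out.println(store.sourceFor("P1")); // indexed now, so served by Solr
    }
}
```

Note that runIndexingJob never touches hybrisDb, which mirrors the one-way communication between Solr and the Hybris DB.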
If it is still not clear why Solr is preferred for searching, no problem: please go through the Solr articles once again.
Solr in Hybris supports 3 types of indexing strategies:
1) Full indexing:
In this strategy, all existing indexed documents are deleted first and then indexing is done again from scratch.
It takes a considerable amount of time, so it is not advisable to run it frequently.
Full indexing supports 2 modes of commit:
a) Direct mode
In this mode, if indexing fails, the previously committed documents will still be available.
b) Two-phase mode
In this mode, if indexing fails, everything is rolled back to the initial state.
Solr creates one extra core as a temporary core used only for indexing; once indexing succeeds, it is swapped with the original core.
So the original core is safe if indexing fails.
It is called two-phase mode mainly because 2 Solr cores are involved while indexing.
The initial core is kept as a backup and the other one is created as a copy.
Indexing is performed on this copy, which is later swapped with the original core if indexing succeeds.
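The difference between the two commit modes can be illustrated with a small Java sketch. It is a simplified model, not Hybris code: a core is just a list of document ids, the failure is simulated with a hypothetical failAt parameter, and it assumes that "previously committed documents" in direct mode means the documents written before the failure. For brevity the temporary core here starts empty instead of being created as a copy.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the two commit modes; all names here are hypothetical.
class FullIndexModesSketch {
    // Stands in for the live Solr core that the storefront searches.
    private List<String> liveCore;

    FullIndexModesSketch(List<String> initialDocs) {
        this.liveCore = new ArrayList<>(initialDocs);
    }

    List<String> liveDocuments() {
        return new ArrayList<>(liveCore);
    }

    // Direct mode: old documents are removed and new ones are written straight
    // to the live core, so whatever was written before a failure stays there.
    void fullIndexDirect(List<String> source, int failAt) {
        liveCore.clear();
        writeAll(liveCore, source, failAt);
    }

    // Two-phase mode: index into a temporary core and swap it in only on success.
    // If writeAll throws, the swap never happens and the live core is untouched.
    void fullIndexTwoPhase(List<String> source, int failAt) {
        List<String> tempCore = new ArrayList<>();
        writeAll(tempCore, source, failAt);
        liveCore = tempCore;
    }

    // Writes documents one by one; failAt = -1 means "never fail".
    private static void writeAll(List<String> core, List<String> source, int failAt) {
        for (int i = 0; i < source.size(); i++) {
            if (i == failAt) {
                throw new IllegalStateException("simulated indexing failure at document " + i);
            }
            core.add(source.get(i));
        }
    }
}
```

With a failure at the third document, the direct-mode run leaves the live core holding only the first two new documents, while the two-phase run leaves the old documents exactly as they were.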
2) Update indexing:
In this strategy, only the documents that have been modified within a given time period are re-indexed; the other indexed documents remain as they are.
This operation can be run frequently if needed, as it takes less time than full indexing.
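As a rough illustration of the update strategy, the Java sketch below re-indexes only the items modified after a given point in time. The Item class, the updateIndex method and the since parameter are assumptions made for this example and do not correspond to real Hybris classes.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;

// Hypothetical model of update indexing; not a Hybris API.
class UpdateIndexSketch {
    // Minimal stand-in for an item stored in the Hybris DB.
    static final class Item {
        final String code;
        final String name;
        final Instant modifiedTime;

        Item(String code, String name, Instant modifiedTime) {
            this.code = code;
            this.name = name;
            this.modifiedTime = modifiedTime;
        }
    }

    // Re-indexes only the items modified after 'since' and returns how many were updated.
    // Documents for unmodified items are left in the index as they are.
    static int updateIndex(Map<String, String> index, List<Item> items, Instant since) {
        int updated = 0;
        for (Item item : items) {
            if (item.modifiedTime.isAfter(since)) {
                index.put(item.code, item.name);
                updated++;
            }
        }
        return updated;
    }
}
```

Because only the modified items are touched, the cost of a run depends on how much changed, not on the size of the whole catalog.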
3) Delete indexing:
This strategy is used to completely remove indexed documents.
It should be run periodically to keep the indexed data consistent, since unwanted documents may have accumulated in Solr over a long time.
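A minimal Java sketch of the delete strategy follows. It assumes the codes of the unwanted documents have already been determined somehow; the deleteFromIndex method and the map-based index are hypothetical and only model the idea.

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical model of delete indexing; not a Hybris API.
class DeleteIndexSketch {
    // Removes the given codes from the index and returns how many documents
    // were actually removed (codes that were never indexed are ignored).
    static int deleteFromIndex(Map<String, String> index, Collection<String> codesToDelete) {
        int removed = 0;
        for (String code : codesToDelete) {
            if (index.remove(code) != null) {
                removed++;
            }
        }
        return removed;
    }
}
```

Only the index is modified here; the source data in the DB is not touched by this operation.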
What can be indexed in Hybris?
We can index any Hybris item type using either the HMC or ImpEx.
ImpEx is the better way, as the configuration is persistent and reusable across all environments (DEV, TEST, PROD).
We just need to define the Solr configuration in the ImpEx file accordingly.
Indexing for the Product item type is already provided by Hybris out of the box.
So if we add new attributes to the Product item type and want them to be indexed, we need to add them to the Solr ImpEx file.
In the Solr ImpEx file, we can define the queries that fetch the data from the Hybris DB for indexing, and we also need to define the field descriptions there.
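To see why a new attribute has to be added to the Solr configuration before it shows up in the index, consider the Java sketch below. It is a conceptual model only: the list of indexed property names plays the role of the field descriptions in the ImpEx file, and toDocument and the attribute warrantyMonths are made-up names for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual model of configured indexed properties; not a Hybris API.
class IndexedPropertySketch {
    // Builds an index document containing only the attributes that are
    // listed as indexed properties; everything else is left out.
    static Map<String, Object> toDocument(Map<String, Object> item, List<String> indexedProperties) {
        Map<String, Object> document = new LinkedHashMap<>();
        for (String property : indexedProperties) {
            if (item.containsKey(property)) {
                document.put(property, item.get(property));
            }
        }
        return document;
    }
}
```

If a product carries a new attribute such as warrantyMonths but the configured list only contains code and name, the resulting document has no warrantyMonths field until that name is added to the list.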
The good part is that Hybris already provides cron jobs for performing full indexing, update indexing and delete indexing.