Case Study – Retail Product Inventory Search
A large global retail company manages millions of products and product attributes. This data must be processed at high peak throughput so that merchandising staff can track and manage the business dynamically. The client’s challenge was to organize and make sense of this volume of product data. Centizen proposed a scalable, highly available search solution over these large volumes of product data. The solution would give the client immediate access to the product data they need. Immediate access to up-to-date product information is what would enable the client to act promptly and react to the market more quickly.
Before adopting Big Data technologies, this client used a home-grown data appliance. To keep the data up to date, the client’s team had to constantly build and rebuild indexes in the data appliance, which was time-consuming and expensive. The client’s plans to expand operations to reach their $50 billion revenue goal made this even more challenging. The client needed a way to scale both the volume of data and the speed at which it is updated.
For this solution, Centizen recommended HBase as the primary data store, with Elasticsearch as the indexing engine, a design chosen to avoid loss of data. Upon client acceptance, the consulting team loaded the client’s structured and unstructured data into their Hadoop data store. From the Hadoop data store, the data was then loaded into HBase, a NoSQL data store. Finally, the team enabled full-text search over the data in HBase.
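The division of labor described above, a primary store that holds the records and a separate index that answers searches, can be illustrated with a minimal sketch. This is not the client’s implementation: the classes PrimaryStore and SearchIndex are hypothetical in-memory stand-ins for HBase and Elasticsearch, and the helper names, row keys, and field names are assumptions made only for illustration.

```python
from collections import defaultdict


class PrimaryStore:
    """Hypothetical in-memory stand-in for the primary data store."""

    def __init__(self):
        self.rows = {}

    def put(self, row_key, record):
        # Store a copy so later changes by the caller do not leak in.
        self.rows[row_key] = dict(record)

    def get(self, row_key):
        return self.rows.get(row_key)


class SearchIndex:
    """Hypothetical stand-in for the indexing engine: a tiny inverted index."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> row keys containing it
        self.indexed_terms = {}           # row key -> terms currently indexed

    @staticmethod
    def tokenize(text):
        return set(text.lower().split())

    def index(self, row_key, record):
        # Drop the postings of any earlier version of this record first,
        # so an update never leaves stale search hits behind.
        for term in self.indexed_terms.get(row_key, set()):
            self.postings[term].discard(row_key)
        terms = set()
        for value in record.values():
            terms |= self.tokenize(str(value))
        for term in terms:
            self.postings[term].add(row_key)
        self.indexed_terms[row_key] = terms

    def search(self, query):
        # Return the row keys that contain every term in the query.
        terms = self.tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            keys = self.postings.get(term, set())
            result = set(keys) if result is None else result & keys
        return result


def upsert_product(store, index, row_key, record):
    # Write to the primary store first: the index holds only derived data
    # and could be rebuilt from the store if it were ever lost.
    store.put(row_key, record)
    index.index(row_key, record)


def search_products(store, index, query):
    # The index yields row keys; the full records come from the store.
    return [store.get(key) for key in sorted(index.search(query))]
```

In this sketch a modified record becomes searchable as soon as upsert_product returns, and the index stores only row keys, so the primary store remains the single source of truth for the product records themselves.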
Outcome / Business Value
Our solution helped the client ingest the source data feed quickly. Scalable, real-time indexing of modified data made search across huge volumes of data flexible and easy. As a result, product data indexing time was reduced from hours to near real time (NRT). This gives our client access to NRT product data and helps them respond to customer needs more quickly and effectively. The client also manages costs more effectively, meeting operational growth goals while cutting total cost of operations by 45% annually. The client runs its big data cluster on inexpensive commodity hardware in the AWS cloud and scales simply by adding more inexpensive nodes, which is seamless and quick.