Challenge

Our client, a leading publisher, needed to make their data accessible to users through an easy-to-use search function. For this, they needed a robust, self-learning information retrieval platform that would ingest over 300 data sources and make them available in a user-friendly manner.

Solution

Infosys used its AI platform, NIA, to develop a stable architecture and implement a data harvester compliant with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). This enabled data ingestion at scale using a variety of open-source and AWS tools. Search functionality was enhanced with advanced NLP and machine learning techniques, along with automated ‘type-ahead’ features.
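
For readers unfamiliar with OAI-PMH, the sketch below illustrates the general shape of a harvesting loop: request a page of records with the ListRecords verb, read the record identifiers, and follow the resumptionToken until the repository returns none. It is a minimal illustration of the protocol, not the harvester built for this project. The function names (parse_list_records, http_fetch, harvest), the oai_dc metadata prefix default, and the injectable fetch parameter are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) for one response page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text.strip()
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
        if el.text
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # A missing or empty resumptionToken element means the list is complete.
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return identifiers, token


def http_fetch(base_url, params):
    """Hypothetical default fetcher: a plain HTTP GET returning the response body."""
    with urlopen(f"{base_url}?{urlencode(params)}") as resp:
        return resp.read().decode("utf-8")


def harvest(base_url, fetch=http_fetch, metadata_prefix="oai_dc"):
    """Yield every record identifier exposed by the repository at base_url."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        identifiers, token = parse_list_records(fetch(base_url, params))
        yield from identifiers
        if not token:
            break
        # A resumed request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Passing the fetcher as a parameter keeps the paging logic separate from the transport, so the loop can be exercised against canned responses before it is pointed at a live repository.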

Outcomes

Spark-based ETL framework ingested data containers in just 2.8 seconds

Systematic approach to search quality assessments

Search functionality improved through NLP and machine learning

Awarded Best Search Project by Information Retrieval Specialists Group

Find out more about how we can help you manage and access your data better.