01 Introduction

This RAG chatbot answers questions about the summaries of a large dataset of physics literature pulled from arxiv.org. The web chatbot can answer questions covering the following information from each paper: title, abstract, author(s), publication date, and other more specific details.

  • Vectorized dataset of 560,300 physics papers available for context (with the option to increase to 1.7 million by upgrading the Vector DB)
  • Multi-threaded JSON data parsing and vectorizing
  • Fast answers using real-time text streaming from Vercel AI
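
The multi-threaded parsing and vectorizing step could look roughly like the sketch below. It is only an illustration under stated assumptions: the input is assumed to be one JSON object per line with `id`, `title`, and `abstract` fields, and `embed` is a hypothetical stand-in for the real embedding call, not the project's actual code.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real embedding model or API call.

    Returns a toy 2-dimensional vector so the sketch runs offline.
    """
    return [float(len(text)), float(len(text.split()))]


def parse_and_vectorize(line: str) -> Dict[str, object]:
    """Parse one JSON record and turn its title and abstract into a vector."""
    record = json.loads(line)
    text = f"{record['title']}. {record['abstract']}"
    return {"id": record["id"], "vector": embed(text)}


# Assumed input shape: one JSON object per line (field names are illustrative).
lines = [
    '{"id": "p1", "title": "Quantum dots", "abstract": "A short abstract"}',
    '{"id": "p2", "title": "Dark matter", "abstract": "Another abstract"}',
]

# pool.map distributes the lines across worker threads and keeps input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    vectors = list(pool.map(parse_and_vectorize, lines))
```

Threads mainly help here when the embedding step waits on I/O, such as a network request to an embedding service. For CPU-bound work in CPython, a process pool is usually the better fit.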

02 Description

The RAG app works by vectorizing the user's text input and using a Euclidean distance comparison to find the closest, and therefore most relevant, pieces of data in the Vector DB. The user input and the relevant context pulled from the DB are then combined into a single prompt and fed into the LLM, which generates the response to the user.
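
The retrieval flow described above can be sketched as follows. This is a minimal illustration, not the app's implementation: the in-memory `records` list stands in for the Vector DB, the 3-dimensional vectors are made-up toy embeddings, and `retrieve` and `build_prompt` are hypothetical helper names.

```python
import math
from typing import List, Tuple

# Hypothetical in-memory stand-in for the Vector DB: (text, embedding) pairs.
Record = Tuple[str, List[float]]


def euclidean_distance(a: List[float], b: List[float]) -> float:
    """Straight-line distance between two vectors; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_vec: List[float], records: List[Record], k: int = 2) -> List[str]:
    """Return the texts of the k records closest to the query vector."""
    ranked = sorted(records, key=lambda r: euclidean_distance(query_vec, r[1]))
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context: List[str]) -> str:
    """Combine the retrieved context and the user question into one LLM prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {question}"


# Toy embeddings; real ones come from an embedding model and are much larger.
records: List[Record] = [
    ("Paper A: superconductivity in thin films", [0.9, 0.1, 0.0]),
    ("Paper B: galaxy rotation curves", [0.0, 0.8, 0.6]),
    ("Paper C: high-temperature superconductors", [0.8, 0.2, 0.1]),
]

query_vec = [1.0, 0.0, 0.0]  # assumed embedding of the user's question
context = retrieve(query_vec, records, k=2)
prompt = build_prompt("What is known about superconductors?", context)
```

In this toy data, the two superconductivity papers lie closest to the query vector, so they form the context that is passed to the LLM along with the question.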