With AI adoption rapidly increasing, businesses and individuals are looking to build smarter search experiences—tools that go beyond simple keyword matching and understand the semantic meaning behind queries. One of the most powerful ways to do this is through Retrieval-Augmented Generation (RAG) using LangChain, a framework that connects LLMs like GPT-4 to external data sources.
In this guide, we’ll walk through how to build a semantic search app from scratch using LangChain, OpenAI, and Chroma (a vector database). This app will allow users to search through a knowledge base of documents and receive contextually relevant, AI-generated answers.
Before diving in, make sure you have the following setup: some .txt or .md files to serve as your knowledge base, and the required packages installed:
pip install langchain openai chromadb tiktoken unstructured pdfplumber pdfminer.six pytesseract beautifulsoup4
LangChain handles the framework, OpenAI provides the LLM, Chroma acts as your vector database, and Tiktoken is used for efficient text tokenization.
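A quick aside on tiktoken: it's handy for checking how many tokens a chunk of text will consume before you send it to the model. Here's a minimal sketch (the sample string is just illustrative; cl100k_base is the encoding used by OpenAI's recent chat and embedding models). Also note that the OpenAI classes used below read your API key from the OPENAI_API_KEY environment variable, so make sure it's set.
import tiktoken
# Count tokens the way OpenAI's models will see them
encoding = tiktoken.get_encoding("cl100k_base")
sample = "LangChain connects LLMs to external data sources."
print(len(encoding.encode(sample)))  # number of tokens in the sample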
The first step is to load documents into LangChain and split them into manageable chunks. This allows the LLM to handle context efficiently.
LangChain provides loaders for a variety of formats:
from langchain.document_loaders import TextLoader, PDFPlumberLoader, UnstructuredHTMLLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Load different types of documents
txt_loader = TextLoader("./data/note.txt")
pdf_loader = PDFPlumberLoader("./data/report.pdf")
html_loader = UnstructuredHTMLLoader("./data/blog.html")
# Combine documents
documents = txt_loader.load() + pdf_loader.load() + html_loader.load()
# Split into chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
split_docs = splitter.split_documents(documents)
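Before moving on, it's worth a quick sanity check that loading and splitting produced what you expect (a hypothetical print, assuming the variables above):
# Quick sanity check on the chunking step
print(f"Loaded {len(documents)} documents")
print(f"Split into {len(split_docs)} chunks")
print(split_docs[0].page_content[:200])  # preview the first chunk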
You can also use UnstructuredImageLoader with OCR for images, or NotionDBLoader for Notion databases.
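For instance, a sketch along these lines should work if you have images or a Notion database to pull in (the file path, integration token, and database ID are placeholders):
from langchain.document_loaders import UnstructuredImageLoader, NotionDBLoader
# OCR an image (requires pytesseract and the tesseract binary)
image_docs = UnstructuredImageLoader("./data/diagram.png").load()
# Pull pages from a Notion database (placeholder credentials)
notion_docs = NotionDBLoader(
    integration_token="your-notion-integration-token",
    database_id="your-database-id"
).load()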
Now that the data is chunked, we’ll convert each piece into a vector embedding using OpenAI’s embedding model.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
embedding_model = OpenAIEmbeddings()
# Create a persistent Chroma DB
db = Chroma.from_documents(split_docs, embedding_model, persist_directory="db")
db.persist()
This stores your documents as vector embeddings so they can be queried semantically later.
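You can already run a raw semantic query against the store at this point; a quick check like the following (the question is just an example) returns the chunks closest in meaning to the query:
# Retrieve the 3 chunks most semantically similar to the query
hits = db.similarity_search("How do I split documents into chunks?", k=3)
for hit in hits:
    print(hit.page_content[:100])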
Now let’s create a retrieval-based question answering chain using the embedded data and OpenAI’s language model.
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
retriever = db.as_retriever()
llm = OpenAI(temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True
)
query = "What does LangChain do?"
result = qa_chain(query)
print("Answer:", result['result'])
LangChain handles passing the query to the retriever, pulling the most relevant context, and combining it with the LLM for the final answer.
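Because we set return_source_documents=True, the result also carries the chunks the answer was grounded in, which is useful for showing citations:
# Inspect which chunks the answer was based on
for doc in result["source_documents"]:
    print(doc.metadata.get("source", "unknown"), "->", doc.page_content[:80])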
Want to filter by topic, date, or document source? You can add metadata to each document:
from langchain.schema import Document
docs = [
    Document(
        page_content="LangChain connects LLMs to external tools.",
        metadata={"topic": "LangChain", "type": "guide"}
    )
]
This allows for filtered retrieval (e.g., only from "guides").
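With Chroma, one way to apply such a filter is through the retriever's search_kwargs (a sketch; the filter keys match the metadata defined above):
# Only retrieve chunks whose metadata marks them as guides
guide_retriever = db.as_retriever(search_kwargs={"filter": {"type": "guide"}})
guide_chain = RetrievalQA.from_chain_type(llm=llm, retriever=guide_retriever)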
You can integrate your backend with a simple web UI. With Streamlit, for example:
import streamlit as st
st.title("Ask your documents anything")
query = st.text_input("Your question")
if query:
    result = qa_chain(query)  # call the chain directly; .run() requires a single output key
    st.write(result["result"])
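Save this together with the earlier setup code in a single file (say, app.py) and launch it with streamlit run app.py; the app opens in your browser.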
If you're looking to track and debug your LangChain pipelines more effectively, LangSmith is your go-to developer tool. It lets you visualize every step of your chains, monitor performance, and identify edge cases with ease.
Here’s how to quickly integrate LangSmith logging into your existing project:
pip install langsmith
Add the following to your terminal or .env file:
export LANGCHAIN_API_KEY="your-langsmith-api-key"
export LANGCHAIN_PROJECT="your-project-name"
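If you want every run traced without wrapping code in a context manager, you can also flip the global tracing switch (to the best of my knowledge this is the standard LangSmith toggle):
export LANGCHAIN_TRACING_V2="true"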
Option A: One-line global tracing
from langchain.callbacks import tracing_v2_enabled
with tracing_v2_enabled():
    result = qa_chain("What is LangChain?")
    print(result["result"])
Option B: Manual tracer setup
from langchain.callbacks.tracers.langchain import LangChainTracer
tracer = LangChainTracer()
result = qa_chain("What is LangChain?", callbacks=[tracer])
That’s it! Your LangChain application is now wired to log runs to LangSmith, which makes debugging, optimizing, and showcasing your chains significantly easier. The complete code for the semantic search app now looks like this:
from langchain.document_loaders import TextLoader, PDFPlumberLoader, UnstructuredHTMLLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
from langchain.schema import Document
from langchain.callbacks import tracing_v2_enabled
import streamlit as st
# Load different types of documents
txt_loader = TextLoader("./data/note.txt")
pdf_loader = PDFPlumberLoader("./data/report.pdf")
html_loader = UnstructuredHTMLLoader("./data/blog.html")
# Combine documents
documents = txt_loader.load() + pdf_loader.load() + html_loader.load()
# Split into chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
split_docs = splitter.split_documents(documents)
# Embed documents
embedding_model = OpenAIEmbeddings()
db = Chroma.from_documents(split_docs, embedding_model, persist_directory="db")
db.persist()
# Set up retrieval chain
retriever = db.as_retriever()
llm = OpenAI(temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True
)
# Optional metadata usage
docs = [
    Document(
        page_content="LangChain connects LLMs to external tools.",
        metadata={"topic": "LangChain", "type": "guide"}
    )
]
# Streamlit frontend
st.title("Ask your documents anything")
query = st.text_input("Your question")
if query:
    with tracing_v2_enabled():  # LangSmith logging
        result = qa_chain(query)
    st.write(result["result"])
With just a few lines of code, you've created an AI-powered semantic search system that loads documents in multiple formats, chunks and embeds them into a vector store, retrieves the most relevant context for each question, and generates grounded answers with an LLM.
This RAG architecture is the backbone of tools like ChatPDF, AskYourPDF, and enterprise chatbots. Whether you're building an internal tool, customer support bot, or personal assistant, LangChain provides everything you need.
Share what you're building or ask questions in the comments. If you found this helpful, check out our full LangChain series and follow along for more advanced tutorials!