Step 1: Building the Foundation for Smarter Science Conversations
Welcome to the first step of a journey that will empower you to see through clickbait headlines and truly understand the science that shapes our world. In this post, we’ll lay the groundwork for creating a tool that decodes scientific studies and helps us navigate misinformation.
The project is ambitious. That’s the point. Today we’ll start small with a few bite-sized steps. By the end, you’ll have taken the first major step toward building a Retrieval-Augmented Generation (RAG) tool. The goal is to make this blog series easy for those new to coding.
Did you miss Part 1 of this blog series?
Read Part 1 HERE.
What We’re Doing in Step 1
Every great project starts with a solid foundation. Today, we’ll:
- Set Up the Development Environment: Get the tools you need to write and test code.
- Install the Basics: Prepare to use Python, the programming language we’ll use to power our tool.
- Understand the Big Picture: Learn how SciBERT (available through Hugging Face), FAISS, and MongoDB connect and work together.
By the end of this step, you’ll be ready to dive into the rest of the series with confidence.
Meet Fit T. Cent, a RAG tool I built to help me on my fitness journey.
Why This Step Matters
Think of this as building the frame for a house. If the foundation isn’t solid, everything else falls apart. In our case, this means making sure your computer is ready to handle the coding, data processing, and AI magic that will follow.
Step 1: Set Up Your Development Environment
The development environment is where you write and test your programs.
- Install Python
  - Go to python.org and download the latest version of Python (it’s free!).
  - Follow the installation instructions for your computer.
- Install VS Code (Visual Studio Code)
  - Download and install VS Code from code.visualstudio.com.
  - This is the tool you’ll use for writing, editing, and running your code.
- Set Up Git (Optional)
  - Git is a version control tool that saves your progress as you code and lets you try out different ways of solving a problem. It’s like time travel for your work.
  - Visit git-scm.com to download it.
  - Bonus: If you’re new to Git, platforms like GitHub can help you share your work with others.
  - Check out the GitHub for Beginners YouTube Playlist.
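Once everything is installed, a quick check can confirm that your computer sees the tools. The sketch below is only an illustration: the helper name `check_environment` and the minimum Python version (3.9) are assumptions made for this example, not requirements stated in this series.

```python
import shutil
import sys

# Assumed minimum version, chosen for illustration only.
MIN_PYTHON = (3, 9)


def check_environment(min_python=MIN_PYTHON):
    """Return a small report about the local Python and Git setup."""
    return {
        # e.g. "3.12.1", built from the running interpreter's version info
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # Tuples compare element by element, so (3, 12) >= (3, 9) is True
        "python_ok": sys.version_info[:2] >= min_python,
        # shutil.which returns the program's path, or None if it is not on PATH
        "git_found": shutil.which("git") is not None,
    }


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

Save this as a file in VS Code and run it. Since Git is optional, a `git_found` value of `False` is fine.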
Step 2: Installing Python Libraries
Libraries are pre-written pieces of code that make your life easier. We’ll install a few that are key to this project:
- Open your terminal (Command Prompt or Terminal app).
- Type the following commands, one per line, to install the libraries:
  pip install fastapi
  pip install uvicorn
  pip install transformers
  pip install faiss-cpu
  pip install pymongo
  pip install beautifulsoup4
These libraries will let us:
- Build a web app (FastAPI + Uvicorn).
- Process scientific language (Transformers + Hugging Face).
- Use FAISS for fast data searches.
- Store data with MongoDB.
- Scrape content from websites (BeautifulSoup).
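To confirm the installs worked, you can ask Python whether it can find each library. This is a minimal sketch; the helper names `is_installed` and `installation_report` are made up for this example. One thing worth knowing: the name you type after `pip install` is not always the name you import. Here, `faiss-cpu` is imported as `faiss` and `beautifulsoup4` is imported as `bs4`.

```python
import importlib.util

# Map each pip package name to the module name used in "import ...".
PACKAGES = {
    "fastapi": "fastapi",
    "uvicorn": "uvicorn",
    "transformers": "transformers",
    "faiss-cpu": "faiss",
    "pymongo": "pymongo",
    "beautifulsoup4": "bs4",
}


def is_installed(module_name):
    """Return True if Python can locate the module (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


def installation_report(packages=PACKAGES):
    """Return {pip_name: True/False} for every package in the mapping."""
    return {pip_name: is_installed(module) for pip_name, module in packages.items()}


if __name__ == "__main__":
    for pip_name, found in installation_report().items():
        status = "installed" if found else "MISSING"
        print(f"{pip_name}: {status}")
```

If anything shows up as missing, rerun the matching `pip install` command and check again.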
Step 3: Big Picture
Let’s step back for a moment and look at what we’re trying to accomplish. Here’s how the pieces fit together:
- The User Interface (Frontend): This is what users will see and interact with—like a chat window or submission form. We’ll build this using Next.js later.
- The Brain (Backend): This is where the magic happens. Python will process user questions, retrieve relevant information, and return easy-to-understand answers.
- The Database: MongoDB will store scientific studies and user interactions, while FAISS will make searching through that data fast and efficient.
- The AI: SciBERT, a language model trained on scientific papers and available through Hugging Face, will help the tool make sense of complex studies so it can return plain-language summaries.
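To make the flow concrete, here is a toy sketch of how a question could travel through these pieces. It uses only plain Python, and every part is a simplified stand-in: a dictionary plays the role of MongoDB, a word-count function plays the role of SciBERT, and a brute-force comparison plays the role of FAISS. The two "studies" are placeholder sentences, and all function names are assumptions for illustration. The real tool built later in the series may look quite different.

```python
import math
import re
from collections import Counter

# Stand-in for MongoDB: study IDs mapped to text.
# These are made-up placeholder sentences, not real findings.
STUDIES = {
    "study-1": "Placeholder text about sleep and memory in adults.",
    "study-2": "Placeholder text about protein intake and muscle growth.",
}


def embed(text):
    """Stand-in for SciBERT: turn text into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Score how much two word-count vectors overlap (0.0 means no overlap)."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(question, studies, top_k=1):
    """Stand-in for FAISS: compare the question against every study."""
    query = embed(question)
    ranked = sorted(
        studies,
        key=lambda study_id: cosine_similarity(query, embed(studies[study_id])),
        reverse=True,
    )
    return ranked[:top_k]


def answer(question, studies=STUDIES):
    """Stand-in for the backend: retrieve the best match and report it."""
    best_id = search(question, studies)[0]
    return f"Most relevant study: {best_id}: {studies[best_id]}"


if __name__ == "__main__":
    print(answer("How does protein affect muscle growth?"))
```

The shape is what matters here: text is turned into numbers, the numbers are searched for the closest match, and the match is handed back to the user. That "retrieve first, then answer" pattern is what puts the R in RAG.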
Strengths of This Approach
- Beginner-Friendly: You’re using tools designed to help you succeed, even if you’re new to coding.
- Open Source: All the software is free and widely supported by online communities.
- Scalable: This setup can grow as the tool becomes more popular.
Weaknesses to Watch Out For
- Learning Curve: If you’re new to coding, some parts might feel overwhelming. Take it one step at a time!
- Setup Time: Installing tools and libraries takes patience, but it’s worth it.
Celebrate Your Progress!
Congratulations! You’ve just completed the first step in building a tool that will make the world smarter about science. Now you’re ready to move on to writing the backend—the brain of the operation.
Share your progress with friends or on social media. Let them know you’re starting a project that will cut through the noise of clickbait science.
Step 2 is ready. In Step 2 we build the backend and connect it to the AI that powers the tool. Click HERE to read Step 2.
Excited about what’s next? Follow this blog for updates or reach out to discuss how this project could inspire smarter solutions for your organization.