PHASE 5 AI Integration: The "Just Connect an API" That Took 700+ Lines of Code

Deeply "disturbed" by AI/ML, Cloud, and Backend, so I'm writing my way through the chaos. Documenting my journey into MLOps and beyond.

Well, this is the second-to-last part of my MVP. In this phase, I just integrated the Groq API (for now).

An API? Should be easy then, right?

No, not quite. At first, I thought: just connect the API, send a prompt, get a response, and boom, done! I was so naive. 😅

Chapter 1: The Innocent Beginning

I started simple. Install the Groq Python SDK, grab my API key, write a basic function:

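import os

from groq import Groq

# Read the key from the environment instead of hard-coding it.
client = Groq(api_key=os.environ["GROQ_API_KEY"])
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Tell me about NUST"}],
    model="llama-3.1-8b-instant",
)
print(response.choices[0].message.content)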

It worked! The AI responded! Ship it, right? Wrong.

Chapter 2: The Real-World Wake-Up Call

Then reality hit me in the face:

Problem #1: Students ask about specific universities, programs, and fees. The AI was making things up (hallucinating) because it didn't have access to my actual university database.

Problem #2: Users expected real-time streaming responses like ChatGPT, not waiting 10 seconds for a full response.

Problem #3: The AI had no memory. Ask about "NUST admission" in one message, then "What about the fee?" in the next, and it forgot which university we were talking about.

Problem #4: Sometimes the AI would try to "pretend" it was calling functions by writing <function=search_universities>{...}</function> in the text instead of using actual tool calls!

Chapter 3: Building the Beast (700 Lines Later)

So I rolled up my sleeves and built a custom AI agent system from scratch. No LangChain, no shortcuts, just pure engineering:

🔧 Function Calling (Tool Use)

Gave the AI two superpowers:

  • search_universities(): Query my Supabase database for real Pakistani university data

  • brave_search(): Search the web when local data isn't enough

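# UniversitySearchArgs is a Pydantic model describing the tool's arguments;
# model_json_schema() turns it into the JSON Schema the API expects.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_universities",
            "description": "Search Pakistani universities by location, program, or fees",
            "parameters": UniversitySearchArgs.model_json_schema(),
        },
    }
]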

Now when a student asks "Which universities in Islamabad offer CS?", the AI actually queries my database instead of guessing!
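Declaring a tool is only half the job: when the model answers with a tool call, something has to run it. Here's a minimal sketch of that dispatch step. The in-memory `UNIVERSITIES` list is made-up stand-in data replacing the Supabase query, and `TOOL_REGISTRY` and `dispatch_tool_call` are hypothetical names, not the project's actual code.

```python
import json

# Made-up stand-in rows; the real project queries Supabase instead.
UNIVERSITIES = [
    {"name": "NUST", "location": "Islamabad", "programs": ["CS", "EE"]},
    {"name": "COMSATS Islamabad", "location": "Islamabad", "programs": ["CS"]},
    {"name": "UET Lahore", "location": "Lahore", "programs": ["EE"]},
]


def search_universities(location=None, program=None):
    """Filter the stand-in list by optional location and program."""
    results = []
    for uni in UNIVERSITIES:
        if location and uni["location"].lower() != location.lower():
            continue
        if program and program not in uni["programs"]:
            continue
        results.append(uni["name"])
    return results


# Map tool names (as declared in `tools`) to Python callables.
TOOL_REGISTRY = {"search_universities": search_universities}


def dispatch_tool_call(name, arguments_json):
    """Run one tool call and return a JSON string to send back to the model."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments_json or "{}")
        return json.dumps({"results": func(**args)})
    except (json.JSONDecodeError, TypeError) as exc:
        # Bad JSON or unexpected argument names: report it instead of crashing.
        return json.dumps({"error": str(exc)})
```

Returning errors as JSON instead of raising lets the model see what went wrong and try again with better arguments.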

⚡ Server-Sent Events (SSE) Streaming

Built a custom streaming system so users see responses word by word in real time:

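def stream_generator():
    # Plain generator: `response` comes from create(..., stream=True) on the sync
    # client, and Starlette iterates sync generators in a threadpool.
    for chunk in response:
        token = chunk.choices[0].delta.content
        if token:  # some chunks (e.g. the last one) carry no text
            yield f"data: {token}\n\n"

return StreamingResponse(stream_generator(), media_type="text/event-stream")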

It felt like magic the first time I saw text appearing live on the frontend! ✨
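One thing to watch with the raw `data: ...` format: an SSE event ends at a blank line, so a token that itself contains a newline can split an event in two. Here's a small framing sketch that avoids this by JSON-encoding each token. The `{"token": ...}` payload shape and the final `done` event are assumptions for illustration; the frontend would need to expect the same format.

```python
import json


def sse_event(data, event=None):
    """Format one Server-Sent Event; multi-line data gets one 'data:' line each."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"


def sse_stream(tokens):
    """Wrap an iterable of text tokens (some possibly None) as SSE frames."""
    for token in tokens:
        if not token:
            continue  # skip empty or None deltas
        # JSON-encode so newlines inside a token cannot break the framing.
        yield sse_event(json.dumps({"token": token}))
    # Assumed convention: tell the client explicitly that the stream is over.
    yield sse_event("[DONE]", event="done")
```

On the client, each `data:` payload can then be parsed with `JSON.parse` and appended to the message being rendered.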

🧠 Conversation Memory

Implemented a smart memory system:

  • Store every message in Supabase with conversation_id

  • Extract important context (student preferences, mentioned universities)

  • Summarize old conversations automatically

  • Load relevant context before each new message

Now the AI remembers: "Oh, you asked about engineering earlier!"
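The "load relevant context" step can be sketched as a sliding window plus a running summary. This is a simplified stand-in, not the project's real memory code: `build_context` and `needs_summary` are hypothetical helpers, the history is a plain list instead of Supabase rows, and the summary text is assumed to come from a separate summarization call.

```python
def build_context(history, summary=None, max_recent=6):
    """Return the messages to send: an optional summary plus the latest turns."""
    messages = []
    if summary:
        # Older turns are represented by one compact system message.
        messages.append(
            {"role": "system", "content": f"Conversation so far: {summary}"}
        )
    recent = history[-max_recent:] if max_recent > 0 else []
    messages.extend(recent)
    return messages


def needs_summary(history, max_recent=6):
    """True once there are older turns that fall outside the recent window."""
    return len(history) > max_recent
```

Keeping only the last few turns verbatim bounds the prompt size, while the summary carries forward details like which university the student asked about.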

🚨 The Hallucination Interceptor

This was the wildest bug I encountered. Sometimes Llama would write:

<function=search_universities>{"location":"Lahore"}</function>
Let me search for you...

Instead of actually calling the function! So I built a regex-based "hallucination detector":

import re

# Matches pseudo tool calls such as <function=name>{...}</function>
pattern = r"<function=([^>]+)>(.*?)</function>"

if "<function" in content_text:
    matches = re.findall(pattern, content_text, flags=re.DOTALL)
    # Convert fake function calls into real tool calls
    mock_tool_calls = [...]
    # Strip the hallucination from the response
    cleaned_content = re.sub(pattern, "", content_text, flags=re.DOTALL)

Crisis averted! 🛡️
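For the curious, here's one way the conversion step could look end to end. It's a sketch under assumptions: `intercept_fake_tool_calls` is a hypothetical name, the generated `call_...` ids are invented locally, and the output dicts follow the OpenAI-style tool call shape (`id`, `type`, `function.name`, `function.arguments`) that the rest of the pipeline is assumed to consume.

```python
import json
import re
import uuid

FAKE_CALL_RE = re.compile(r"<function=([^>]+)>(.*?)</function>", re.DOTALL)


def intercept_fake_tool_calls(content_text):
    """Return (cleaned_text, tool_calls) for text containing <function=...> tags."""
    tool_calls = []
    for name, raw_args in FAKE_CALL_RE.findall(content_text):
        try:
            args = json.loads(raw_args.strip() or "{}")
        except json.JSONDecodeError:
            continue  # arguments are not valid JSON: drop the call, don't guess
        tool_calls.append({
            "id": f"call_{uuid.uuid4().hex[:8]}",  # locally generated id
            "type": "function",
            "function": {"name": name.strip(), "arguments": json.dumps(args)},
        })
    # Remove every pseudo call from the text the user will see.
    cleaned = FAKE_CALL_RE.sub("", content_text).strip()
    return cleaned, tool_calls
```

Calls with unparseable arguments are dropped rather than repaired, on the theory that running a tool with guessed arguments is worse than not running it.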

🔄 Message Serialization

Groq's API is picky about message formats. One wrong field (like a Pydantic internal field leaking in) and boom: 400 Bad Request. Built a custom serializer to whitelist only valid fields:

def serialize_messages_for_groq(messages):
    # Strip Pydantic internals, validate required fields,
    # ensure tool_calls have 'type', etc.
    ...  # implementation elided
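Here's a minimal sketch of what such a whitelist could look like for messages stored as plain dicts. The allowed keys and roles below are assumptions based on the OpenAI-style chat format, not a list taken from Groq's documentation, and `whitelist_messages` is a hypothetical name.

```python
# Assumed whitelist, modeled on the OpenAI-style chat message format.
ALLOWED_MESSAGE_KEYS = {"role", "content", "name", "tool_calls", "tool_call_id"}
VALID_ROLES = {"system", "user", "assistant", "tool"}


def whitelist_messages(messages):
    """Return copies of `messages` that contain only whitelisted fields."""
    cleaned = []
    for msg in messages:
        if msg.get("role") not in VALID_ROLES:
            raise ValueError(f"invalid role: {msg.get('role')!r}")
        out = {k: v for k, v in msg.items() if k in ALLOWED_MESSAGE_KEYS}
        if out.get("tool_calls"):
            # Every tool call needs a 'type'; default to "function" if missing.
            out["tool_calls"] = [
                {"type": "function", **call} for call in out["tool_calls"]
            ]
        if out["role"] == "tool" and "tool_call_id" not in out:
            raise ValueError("tool messages need a tool_call_id")
        cleaned.append(out)
    return cleaned
```

Failing loudly with `ValueError` before the request goes out tends to be easier to debug than a bare 400 from the API.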

Chapter 4: The Aha Moment

After 2 weeks of debugging, refactoring, and head-scratching, it finally clicked together:

  1. User asks: "Tell me about COMSATS Islamabad"

  2. AI decides: "I need to search the database"

  3. Tool call triggered → Queries Supabase → Returns real data

  4. AI gets results: "COMSATS Islamabad has 15 programs, fees range from..."

  5. Streams response word-by-word to frontend

  6. Saves conversation + updates memory

  7. Next message: AI remembers we're talking about COMSATS!

The system was alive. 🤖💚
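Steps 1 to 4 of that flow boil down to a small loop: call the model, run whatever tools it asks for, feed the results back, and repeat until it answers in plain text. Here's a stripped-down sketch with streaming and persistence left out. `run_turn` is a hypothetical helper, and `call_model` is an assumed callable that wraps the LLM request and returns a dict with `content` and optional `tool_calls`.

```python
import json


def run_turn(user_text, history, call_model, tools, max_rounds=3):
    """One user turn: call the model, run requested tools, return the answer.

    call_model(messages) returns a dict with 'content' and optional 'tool_calls'.
    tools maps tool names to Python callables that take keyword arguments.
    """
    messages = history + [{"role": "user", "content": user_text}]
    for _ in range(max_rounds):
        reply = call_model(messages)
        messages.append({"role": "assistant", **reply})
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply["content"], messages  # plain text: final answer
        for call in tool_calls:
            fn = call["function"]
            result = tools[fn["name"]](**json.loads(fn["arguments"]))
            # Feed each tool result back, linked to the call that requested it.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    # Safety net so a model that keeps calling tools cannot loop forever.
    return "Sorry, I could not finish that request.", messages
```

The `max_rounds` cap is a design choice worth keeping even in a small project: without it, a model that keeps requesting tools would never return.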

The Stack Breakdown:

| Component           | Technology               | Lines of Code |
|---------------------|--------------------------|---------------|
| Core AI Client      | Groq SDK (Llama 3.3 70B) | ~54           |
| Streaming Endpoint  | FastAPI + SSE            | ~450          |
| Conversation Memory | Supabase + Custom Logic  | ~150          |
| Tool Definitions    | JSON Schema              | ~35           |
| Utilities           | Custom Helpers           | ~56           |
| Total               | Pure Python              | ~710          |

What I Learned:

  1. "Just integrate an API" is a lie. Production AI systems need tools, memory, error handling, streaming, and hallucination detection.

  2. LangChain wasn't needed. I almost added it, but building from scratch gave me way more control and understanding.

  3. Groq is insanely fast (800+ tokens/sec), but you still need smart architecture to make it feel instant.

  4. LLMs will surprise you, in both good (clever reasoning) and bad (creative hallucinations) ways.

The Result:

Students now have an AI counselor that:

  • ✅ Knows real Pakistani university data

  • ✅ Streams responses in real-time

  • ✅ Remembers conversation context

  • ✅ Can search the web for additional info

  • ✅ Doesn't make up fake statistics

And all from what started as: "I'll just call an API..." 😄


Next Up: The final piece of the puzzle: frontend integration and deployment. Stay tuned!


Stats:

  • Model: Llama 3.3 70B via Groq

  • Response Speed: 800-1000 tokens/sec

  • Architecture: Custom RAG + Function Calling

  • Dependencies Avoided: LangChain (didn't need it!)

  • Coffee Consumed: Too much ☕


What's your experience building with AI APIs? Did you also think it would be "just 5 lines of code"? Drop your stories in the comments! 👇

AI Tutor for Gilgit Baltistan

Part 4 of 4

This series documents the journey of building an industry-standard AI Tutor web application designed specifically for students in Gilgit Baltistan. Through the Feel and Support Organization (FSO), my goal is to bridge the educational gap by providing students with personalized university roadmaps, scholarship data, and entry test preparation.
