PHASE 5 AI Integration: The "Just Connect an API" That Took 700+ Lines of Code

Well, this is the second-to-last part of my MVP. In this phase, I integrated an AI model, using just the Groq API for now.

An API? Should be easy then, right?

No, not quite. At first, I thought: just connect the API, send a prompt, get a response, and boom, done! I was so naive. 😅

Chapter 1: The Innocent Beginning

I started simple. Install the Groq Python SDK, grab my API key, write a basic function:

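import os

from groq import Groq

# Read the key from the environment instead of hard-coding it
client = Groq(api_key=os.environ["GROQ_API_KEY"])
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Tell me about NUST"}],
    model="llama-3.1-8b-instant",
)
print(response.choices[0].message.content)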

It worked! The AI responded! Ship it, right? Wrong.

Chapter 2: The Real-World Wake-Up Call

Then reality hit me in the face:

Problem #1: Students ask about specific universities, programs, and fees. The AI was making things up (hallucinating) because it didn't have access to my actual university database.

Problem #2: Users expected real-time streaming responses like ChatGPT, not waiting 10 seconds for a full response.

Problem #3: The AI had no memory. Ask about "NUST admission" in one message, then "What about the fee?" — and it forgot what university we were talking about.

Problem #4: Sometimes the AI would try to "pretend" it was calling functions by writing <function=search_universities>{...}</function> in the text instead of using actual tool calls!

Chapter 3: Building the Beast (700 Lines Later)

So I rolled up my sleeves and built a custom AI agent system from scratch. No LangChain, no shortcuts — pure engineering:

🔧 Function Calling (Tool Use)

Gave the AI two superpowers:

search_universities() — Query my Supabase database for real Pakistani university data
brave_search() — Search the web when local data isn't enough

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_universities",
            "description": "Search Pakistani universities by location, program, or fees",
            "parameters": UniversitySearchArgs.model_json_schema()
        }
    }
]

Now when a student asks "Which universities in Islamabad offer CS?", the AI actually queries my database instead of guessing!
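
For the step after the model asks for a tool, here is a minimal sketch of the dispatch side. It assumes a tool call arrives as a name plus a JSON string of arguments (the shape OpenAI-compatible chat APIs use). The in-memory `UNIVERSITIES` list, the body of `search_universities`, `TOOL_REGISTRY`, and `run_tool_call` are illustrative stand-ins, not the project's actual Supabase code.

```python
import json

# Hypothetical stand-in data; the real project queries Supabase instead.
UNIVERSITIES = [
    {"name": "NUST", "city": "Islamabad", "programs": ["CS", "EE"]},
    {"name": "COMSATS", "city": "Islamabad", "programs": ["CS", "BBA"]},
    {"name": "LUMS", "city": "Lahore", "programs": ["CS", "Economics"]},
]


def search_universities(location=None, program=None):
    """Filter the stand-in list by city and/or program (both optional)."""
    results = UNIVERSITIES
    if location:
        results = [u for u in results if u["city"].lower() == location.lower()]
    if program:
        results = [u for u in results if program in u["programs"]]
    return results


# Map tool names (as declared in `tools`) to the Python callables behind them.
TOOL_REGISTRY = {"search_universities": search_universities}


def run_tool_call(call_id, name, arguments_json):
    """Execute one tool call and build the 'tool' message to send back."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        payload = {"error": f"unknown tool: {name}"}
    else:
        try:
            payload = func(**json.loads(arguments_json or "{}"))
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON or unexpected argument names from the model.
            payload = {"error": str(exc)}
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(payload)}
```

Returning errors as tool output, rather than raising, lets the model see what went wrong and retry with better arguments.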

⚡ Server-Sent Events (SSE) Streaming

Built a custom streaming system so users see responses token-by-token in real time:

async def stream_generator():
    for chunk in response:
        delta = chunk.choices[0].delta.content
        if delta:  # role-only and final chunks carry no text
            yield f"data: {delta}\n\n"

# Inside the FastAPI endpoint:
return StreamingResponse(stream_generator(), media_type="text/event-stream")

It felt like magic the first time I saw text appearing live on the frontend! ✨
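
One wrinkle with raw `data: <text>` frames: in SSE a blank line ends the event, so a newline inside the model's text can split a frame. A common workaround, sketched here as an assumption rather than what this project does, is to JSON-encode each chunk. The helper names `sse_event` and `sse_frames` and the `[DONE]` end marker are illustrative choices.

```python
import json


def sse_event(payload):
    """Format one SSE frame; JSON-encoding keeps newlines inside a single data line."""
    return f"data: {json.dumps(payload)}\n\n"


def sse_frames(deltas):
    """Turn an iterable of text deltas (possibly None or empty) into SSE frames."""
    for delta in deltas:
        if delta:  # skip None/empty deltas
            yield sse_event({"content": delta})
    # Assumed end-of-stream marker so the frontend knows when to stop reading.
    yield "data: [DONE]\n\n"
```

On the frontend, each `data:` line is then parsed with `JSON.parse`, except for the final marker.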

🧠 Conversation Memory

Implemented a smart memory system:

Store every message in Supabase with conversation_id
Extract important context (student preferences, mentioned universities)
Summarize old conversations automatically
Load relevant context before each new message

Now the AI remembers: "Oh, you asked about engineering earlier!"
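
The "load relevant context" step can be sketched roughly like this, assuming the stored history is a list of message dicts (oldest first) and that a summary of older turns was produced elsewhere. `build_context` and its parameters are hypothetical names, and the Supabase reads are left out.

```python
def build_context(system_prompt, summary, history, max_recent=6):
    """Assemble the message list sent to the model for the next turn.

    Only the last `max_recent` stored messages are replayed verbatim;
    older turns are represented by `summary`, if there is one.
    """
    messages = [{"role": "system", "content": system_prompt}]
    if summary:
        messages.append(
            {"role": "system", "content": f"Summary of earlier conversation: {summary}"}
        )
    # Guard against max_recent == 0, since history[-0:] would return everything.
    recent = history[-max_recent:] if max_recent > 0 else []
    messages.extend(recent)
    return messages
```

Capping the replayed messages keeps the prompt size bounded however long the conversation gets, while the summary carries forward things like which university is being discussed.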

🚨 The Hallucination Interceptor

This was the wildest bug I encountered. Sometimes Llama would write:

<function=search_universities>{"location":"Lahore"}</function>
Let me search for you...

Instead of actually calling the function! So I built a regex-based "hallucination detector":

import re

pattern = r"<function=([^>]+)>(.*?)</function>"
if "<function" in content_text:
    # DOTALL lets the JSON arguments span multiple lines
    matches = re.findall(pattern, content_text, flags=re.DOTALL)
    # Convert fake function calls into real tool calls
    mock_tool_calls = [...]
    # Strip the hallucination from the response
    cleaned_content = re.sub(pattern, "", content_text, flags=re.DOTALL)

Crisis averted! 🛡️
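
To make the conversion step concrete, here is a self-contained sketch of turning the fake markup into tool-call-shaped dicts. The output shape follows the OpenAI-compatible `tool_calls` format. The function name `extract_fake_tool_calls` and the synthetic `call_...` ids are assumptions for illustration, not the project's actual implementation.

```python
import json
import re
import uuid

FAKE_CALL = re.compile(r"<function=([^>]+)>(.*?)</function>", re.DOTALL)


def extract_fake_tool_calls(content_text):
    """Return (tool_calls, cleaned_text) for text containing <function=...> markup."""
    tool_calls = []
    for name, raw_args in FAKE_CALL.findall(content_text):
        try:
            json.loads(raw_args)  # validate only; arguments stay a JSON string
        except json.JSONDecodeError:
            continue  # ignore calls whose arguments aren't valid JSON
        tool_calls.append({
            "id": f"call_{uuid.uuid4().hex[:8]}",  # synthetic id, not from the API
            "type": "function",
            "function": {"name": name.strip(), "arguments": raw_args.strip()},
        })
    # The markup is stripped from the visible text even if its JSON was invalid.
    cleaned = FAKE_CALL.sub("", content_text).strip()
    return tool_calls, cleaned
```

Run on the example above, this yields one `search_universities` call with `{"location":"Lahore"}` as its arguments and leaves "Let me search for you..." as the cleaned text.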

🔄 Message Serialization

Groq's API is picky about message formats. One wrong field (like a Pydantic internal field leaking in) and boom — 400 Bad Request. Built a custom serializer to whitelist only valid fields:

def serialize_messages_for_groq(messages):
    # Strip Pydantic internals, validate required fields,
    # ensure tool_calls have 'type', etc.
    ...
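
A whitelist serializer along those lines might look like the sketch below. It assumes messages have already been converted to plain dicts, and the allowed keys are the usual OpenAI-compatible chat fields. The name `whitelist_message_fields` and the choice to drop `None` values are assumptions, not the project's real serializer.

```python
ALLOWED_KEYS = {"role", "content", "name", "tool_calls", "tool_call_id"}
VALID_ROLES = {"system", "user", "assistant", "tool"}


def whitelist_message_fields(messages):
    """Keep only known fields on each message dict and patch up tool calls."""
    cleaned = []
    for msg in messages:
        if msg.get("role") not in VALID_ROLES:
            raise ValueError(f"invalid or missing role: {msg.get('role')!r}")
        # Drop unknown keys and None values (assumed to be leftover model fields).
        out = {k: v for k, v in msg.items() if k in ALLOWED_KEYS and v is not None}
        if "tool_calls" in out:
            # Copy each call so the caller's data isn't mutated; default the type.
            out["tool_calls"] = [{"type": "function", **call} for call in out["tool_calls"]]
        cleaned.append(out)
    return cleaned
```

Failing fast on a bad role gives a readable local error instead of an opaque 400 from the API.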

Chapter 4: The Aha Moment

After 2 weeks of debugging, refactoring, and head-scratching, everything finally clicked into place:

User asks: "Tell me about COMSATS Islamabad"
AI decides: "I need to search the database"
Tool call triggered → Queries Supabase → Returns real data
AI gets results: "COMSATS Islamabad has 15 programs, fees range from..."
Streams response word-by-word to frontend
Saves conversation + updates memory
Next message: AI remembers we're talking about COMSATS!

The system was alive. 🤖💚
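
The flow above can be sketched as a small loop. This is a simplified illustration, not the project's code: `call_model` is a hypothetical wrapper around the chat API that returns an assistant message dict, `tools` maps tool names to callables, and streaming, persistence, and error handling are left out.

```python
import json


def run_turn(messages, call_model, tools, max_rounds=3):
    """Call the model, execute any requested tools, and repeat until it answers."""
    for _ in range(max_rounds):
        reply = call_model(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls")
        if not tool_calls:
            return reply["content"]  # plain answer: the turn is finished
        for call in tool_calls:
            func = tools[call["function"]["name"]]
            result = func(**json.loads(call["function"]["arguments"]))
            # Feed the tool output back so the model can ground its answer in it.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    raise RuntimeError("model kept requesting tools; giving up")
```

The `max_rounds` cap is there so a model that keeps asking for tools can't loop forever.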

The Stack Breakdown:

Component	Technology	Lines of Code
Core AI Client	Groq SDK (Llama 3.3 70B)	~54
Streaming Endpoint	FastAPI + SSE	~450
Conversation Memory	Supabase + Custom Logic	~150
Tool Definitions	JSON Schema	~35
Utilities	Custom Helpers	~56
Total	Pure Python	~710

What I Learned:

"Just integrate an API" is a lie. Production AI systems need tools, memory, error handling, streaming, and hallucination detection.
LangChain wasn't needed. I almost added it, but building from scratch gave me way more control and understanding.
Groq is insanely fast (800+ tokens/sec), but you still need smart architecture to make it feel instant.
LLMs will surprise you — in both good (clever reasoning) and bad (creative hallucinations) ways.

The Result:

Students now have an AI counselor that:

✅ Knows real Pakistani university data
✅ Streams responses in real-time
✅ Remembers conversation context
✅ Can search the web for additional info
✅ Doesn't make up fake statistics

And all from what started as: "I'll just call an API..." 😄

Next Up: The final piece of the puzzle: frontend integration and deployment. Stay tuned!

Stats:

Model: Llama 3.3 70B via Groq
Response Speed: 800-1000 tokens/sec
Architecture: Custom RAG + Function Calling
Dependencies Removed: LangChain (didn't need it!)
Coffee Consumed: Too much ☕

What's your experience building with AI APIs? Did you also think it would be "just 5 lines of code"? Drop your stories in the comments! 👇
