In this tutorial, we build a DeerFlow-inspired, Groq-powered research workflow that runs directly against Groq’s free OpenAI-compatible inference endpoint. We configure LangChain’s ChatOpenAI interface to work with Groq by setting the Groq API key and base URL, allowing us to use fast hosted models such as llama-3.3-70b-versatile for tool-based inference. We then connect the model to practical tools for web search, web-page fetching, file manipulation, Python execution, skill loading, sub-agent delegation, and long-term memory. By the end of the tutorial, we have a multi-step Groq-backed agent that can research a topic, delegate focused subtasks, generate structured output, and save useful information for subsequent runs.
import subprocess, sys
def _pip(*a): subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", *a])
_pip("langgraph>=0.2.50", "langchain>=0.3.0", "langchain-openai>=0.2.0",
     "langchain-community>=0.3.0", "ddgs", "requests", "beautifulsoup4",
     "tiktoken", "pydantic>=2.0")
import os, getpass
if not os.environ.get("GROQ_API_KEY"):
    os.environ["GROQ_API_KEY"] = getpass.getpass("GROQ_API_KEY (free at console.groq.com/keys): ")
os.environ["OPENAI_API_KEY"] = os.environ["GROQ_API_KEY"]
os.environ["OPENAI_BASE_URL"] = "https://api.groq.com/openai/v1"
MODEL_NAME = "llama-3.3-70b-versatile"
import json, re, io, contextlib, pathlib
from typing import Annotated, TypedDict, Sequence, Literal, List, Dict, Any
from datetime import datetime, timezone
from langchain_openai import ChatOpenAI
from langchain_core.messages import (
    SystemMessage, HumanMessage, AIMessage, ToolMessage, BaseMessage)
from langchain_core.tools import tool
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode
We install the core libraries needed to build Groq-powered agentic workflows, including LangGraph, LangChain, DuckDuckGo search, and supporting parsing libraries. We securely collect the Groq API key and configure Groq as an OpenAI-compatible endpoint by setting the API key and base URL. We then import all the modules required for messaging, tools, graph construction, file-system manipulation, and model initialization.
SANDBOX = pathlib.Path("/content/deerflow_sandbox").resolve()
for sub in ["uploads","workspace","outputs","skills/public","skills/custom","memory"]:
    (SANDBOX/sub).mkdir(parents=True, exist_ok=True)
def _safe(p: str) -> pathlib.Path:
    full = (SANDBOX/p.lstrip("/")).resolve()
    if not str(full).startswith(str(SANDBOX)):
        raise ValueError(f"path escapes sandbox: {p}")
    return full
SKILLS: Dict[str, Dict[str,str]] = {}
def register_skill(name, description, content, location="public"):
    d = SANDBOX/"skills"/location/name; d.mkdir(parents=True, exist_ok=True)
    (d/"SKILL.md").write_text(content)
    SKILLS[name] = {"description": description, "content": content,
                    "path": str(d/"SKILL.md")}
register_skill("research",
"Conduct multi-source web research on a topic and produce structured notes.",
""" Research Skill
Workflow
1. Decompose the question into 3-5 sub-questions.
2. For each sub-question call `web_search` and pick 2 authoritative URLs.
3. `web_fetch` those URLs; extract concrete facts, numbers, dates.
4. Cross-reference for consensus vs. disagreement.
5. Append findings to `workspace/research_notes.md`: claim → evidence → URL.
Best practices
- Prefer primary sources. Note dates. Never fabricate URLs or numbers.""")
register_skill("report-generation",
"Synthesize research notes into a polished markdown report in outputs/.",
""" Report Generation Skill
Workflow
1. file_read('workspace/research_notes.md').
2. Outline: exec summary, key findings, analysis, conclusion, sources.
3. file_write('outputs/report.md', ...).
Structure
- Title
- Executive Summary (3-5 sentences)
- Key Findings (bullets)
- Detailed Analysis (sections)
- Conclusion
- Sources (numbered URL list)""")
register_skill("code-execution",
"Run Python in the sandbox for computation, data wrangling, charts.",
""" Code Execution Skill
1. Plan in plain language first.
2. python_exec the code; persistent artifacts go to /outputs/.
3. Verify before quoting results.""")
MEM = SANDBOX/"memory/long_term.json"
if not MEM.exists():
    MEM.write_text(json.dumps({"facts":[],"preferences":{}}, indent=2))
def _load_mem(): return json.loads(MEM.read_text())
def _save_mem(m): MEM.write_text(json.dumps(m, indent=2))
We create a sandboxed project directory in Colab to keep uploads, workspace files, outputs, skills, and memory organized in one manageable place. We define reusable skills for searching, generating reports, and executing code so the agent can discover and follow structured workflows. We also initialize a simple long-term memory JSON file that stores facts and preferences across multiple runs within the same sandbox.
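The key safety property of the sandbox is the path guard: every tool path is resolved against the sandbox root and rejected if it escapes. Here is a minimal, self-contained sketch of the same check, using a temporary directory in place of the Colab sandbox (`ROOT` and `safe` are illustrative stand-ins for the tutorial's `SANDBOX` and `_safe`):

```python
import pathlib, tempfile

# Stand-in for the Colab sandbox root used in the tutorial.
ROOT = pathlib.Path(tempfile.mkdtemp()).resolve()

def safe(p: str) -> pathlib.Path:
    # Resolve the path relative to ROOT and refuse anything that escapes it.
    full = (ROOT / p.lstrip("/")).resolve()
    if not str(full).startswith(str(ROOT)):
        raise ValueError(f"path escapes sandbox: {p}")
    return full

print(safe("workspace/notes.md"))  # stays inside ROOT: allowed
try:
    safe("../../etc/passwd")       # resolves outside ROOT: rejected
except ValueError as e:
    print("blocked:", e)
```

Because the check runs on the fully resolved path, `..` segments cannot sneak a write outside the sandbox even when they are buried in the middle of a path.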
@tool
def list_skills() -> str:
    """List all skills with one-line descriptions. Call this first for complex tasks."""
    return "\n".join(f"- {n}: {s['description']}" for n, s in SKILLS.items())
@tool
def load_skill(name: str) -> str:
    """Load full SKILL.md for `name`. Call before running its workflow."""
    if name not in SKILLS: return f"Unknown. Available: {list(SKILLS)}"
    return SKILLS[name]["content"]
@tool
def web_search(query: str, max_results: int = 5) -> str:
    """Search the web (DuckDuckGo). Returns titles, URLs, snippets."""
    from ddgs import DDGS
    out = []
    try:
        with DDGS() as d:
            for r in d.text(query, max_results=max_results):
                out.append(f"- {r.get('title','')}\n  URL: {r.get('href','')}\n  "
                           f"{(r.get('body') or '')[:220]}")
    except Exception as e:
        return f"search error: {e}"
    return "\n".join(out) or "no results"
@tool
def web_fetch(url: str, max_chars: int = 4000) -> str:
    """Fetch a URL, return cleaned text (scripts/nav stripped)."""
    import requests
    from bs4 import BeautifulSoup
    try:
        r = requests.get(url, timeout=15,
                         headers={"User-Agent": "Mozilla/5.0 DeerFlow-Lite"})
        soup = BeautifulSoup(r.text, "html.parser")
        for s in soup(["script","style","nav","footer","aside","header"]): s.decompose()
        text = re.sub(r"\n\s*\n", "\n\n", soup.get_text("\n")).strip()
        return text[:max_chars] or "(empty page)"
    except Exception as e:
        return f"fetch error: {e}"
@tool
def file_write(path: str, content: str) -> str:
    """Write content to a sandbox path, e.g. 'workspace/notes.md' or 'outputs/x.md'."""
    p = _safe(path); p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(content)
    return f"wrote {len(content)} chars → {path}"
@tool
def file_read(path: str) -> str:
    """Read a sandbox file (first 8 KB)."""
    p = _safe(path)
    return p.read_text()[:8000] if p.exists() else f"not found: {path}"
@tool
def file_list(path: str = "") -> str:
    """List files under a sandbox dir."""
    base = _safe(path) if path else SANDBOX
    if not base.exists(): return "not found"
    items = []
    for c in sorted(base.rglob("*")):
        if "memory" in c.relative_to(SANDBOX).parts: continue
        items.append(f" {'D' if c.is_dir() else 'F'} {c.relative_to(SANDBOX)}")
    return "\n".join(items[:60]) or "(empty)"
@tool
def python_exec(code: str) -> str:
    """Run Python in the sandbox. SANDBOX_ROOT is preset."""
    g = {"__name__": "__sb__", "SANDBOX_ROOT": str(SANDBOX)}
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf), contextlib.redirect_stderr(buf):
            exec(code, g)
        return (buf.getvalue() or "(no stdout)")[:4000]
    except Exception as e:
        return f"{type(e).__name__}: {e}\n{buf.getvalue()[:1500]}"
@tool
def remember(fact: str) -> str:
    """Persist a single fact to long-term memory (survives across runs)."""
    m = _load_mem()
    m["facts"].append({"fact": fact, "ts": datetime.now(timezone.utc).isoformat()})
    _save_mem(m)
    return f"remembered ({len(m['facts'])} total)"
@tool
def recall() -> str:
    """Retrieve everything in long-term memory."""
    m = _load_mem()
    if not m["facts"]: return "(memory empty)"
    return "\n".join(f"- {f['fact']}" for f in m["facts"][-20:])
We define the main tools the Groq-powered agent can call during execution: listing skills, loading skill instructions, searching the web, fetching web pages, and reading, writing, and listing files. We also give the agent a sandboxed Python execution environment so it can run calculations or create artifacts when needed. Finally, we add memory tools that let the agent remember important facts and recall previously stored information.
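The python_exec tool relies on a standard-library pattern: execute the code string with exec while stdout and stderr are redirected into an in-memory buffer, so the agent sees printed output and errors as a plain string. A minimal sketch of that capture pattern, independent of the agent harness (`run_snippet` is an illustrative name, not part of the tutorial's code):

```python
import io, contextlib

def run_snippet(code: str) -> str:
    """Execute code in a fresh namespace and return captured stdout/stderr."""
    ns = {"__name__": "__sandbox__"}
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf), contextlib.redirect_stderr(buf):
            exec(code, ns)  # illustrative only; real sandboxing needs more than this
        return buf.getvalue() or "(no stdout)"
    except Exception as e:
        # Surface the error type plus whatever was printed before it.
        return f"{type(e).__name__}: {e}\n{buf.getvalue()}"

print(run_snippet("print(2 + 2)"))  # returns "4\n"
print(run_snippet("1 / 0"))         # returns a ZeroDivisionError message
```

Note that redirecting streams is output capture, not isolation: the snippet still runs in the host interpreter, which is acceptable here only because the whole notebook already lives in a disposable Colab sandbox.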
@tool
def spawn_subagent(role: str, task: str,
                   allowed_tools: str = "web_search,web_fetch,file_write,file_read") -> str:
    """Spawn an isolated sub-agent with a focused role and scoped tools.
    Returns its final report string. Use for parallelizable / focused subtasks."""
    bag = {t.name: t for t in BASE_TOOLS}
    sub_tools = [bag[n.strip()] for n in allowed_tools.split(",") if n.strip() in bag]
    sub_llm = ChatOpenAI(model=MODEL_NAME, temperature=0.2).bind_tools(sub_tools)
    sys_msg = SystemMessage(content=(
        f"You are a specialized sub-agent. Role: {role}.\n"
        f"You operate in an ISOLATED context - no access to lead history.\n"
        f"Tools: {', '.join(t.name for t in sub_tools)}.\n"
        "End with a final assistant message starting 'FINAL REPORT:' "
        "containing a structured ≤700-word summary including any URLs."))
    msgs: List[BaseMessage] = [sys_msg, HumanMessage(content=task)]
    for _ in range(8):
        r = sub_llm.invoke(msgs); msgs.append(r)
        if not getattr(r, "tool_calls", None):
            return f"[sub-agent: {role}]\n" + (r.content if isinstance(r.content, str) else str(r.content))
        for tc in r.tool_calls:
            t = bag.get(tc["name"])
            try:
                res = t.invoke(tc["args"]) if t else f"unknown tool {tc['name']}"
            except Exception as e:
                res = f"tool error: {e}"
            msgs.append(ToolMessage(content=str(res)[:3000], tool_call_id=tc["id"]))
    return f"[sub-agent: {role}] step-limit reached."
BASE_TOOLS = [list_skills, load_skill, web_search, web_fetch, file_write,
file_read, file_list, python_exec, remember, recall]
ALL_TOOLS = BASE_TOOLS + [spawn_subagent]
LEAD_SYSTEM = f"""You are DeerFlow-Lite, a long-horizon super-agent harness.
Sandbox layout (relative to {SANDBOX}):
uploads/ - user files
workspace/ - your scratchpad
outputs/ - final deliverables
skills/ - capability modules (load_skill)
Principles:
• For non-trivial tasks: list_skills → load_skill → execute.
• Use spawn_subagent for focused subtasks (isolated context keeps lead lean).
• Persist intermediates to workspace/, deliverables to outputs/.
• Use remember(fact) for cross-session knowledge.
• Finish with a short summary of what was produced and where.
Today: {datetime.now(timezone.utc).strftime('%Y-%m-%d')}."""
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
llm = ChatOpenAI(model=MODEL_NAME, temperature=0.3).bind_tools(ALL_TOOLS)
def call_model(state: AgentState):
    msgs = list(state["messages"])
    if not msgs or not isinstance(msgs[0], SystemMessage):
        msgs = [SystemMessage(content=LEAD_SYSTEM)] + msgs
    return {"messages": [llm.invoke(msgs)]}
def route(state: AgentState) -> Literal["tools", "__end__"]:
    last = state["messages"][-1]
    return "tools" if getattr(last, "tool_calls", None) else END
g = StateGraph(AgentState)
g.add_node("agent", call_model)
g.add_node("tools", ToolNode(ALL_TOOLS))
g.set_entry_point("agent")
g.add_conditional_edges("agent", route, {"tools": "tools", END: END})
g.add_edge("tools", "agent")
APP = g.compile()
We create a spawn_subagent tool that lets the lead Groq-powered agent delegate focused tasks to an isolated assistant with a limited set of tools. We then gather all available tools, define the lead system prompt, configure the Groq-backed chat model, and bind the tools to it. We finally build a LangGraph workflow so the agent can alternate between reasoning and tool execution until it reaches a final answer.
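The graph implements a classic tool-calling loop: the model either emits tool calls, which are routed to the tool node and fed back, or a plain answer, which ends the run. That control flow can be illustrated with a dependency-free toy in which a scripted "model" and a tiny tool table stand in for the Groq LLM and the real tools (everything here is illustrative, not part of the tutorial's stack):

```python
# Toy think->act loop mirroring the agent -> tools -> agent cycle in the graph.
def toy_model(history):
    """Scripted stand-in for the LLM: request a tool once, then answer."""
    if not any(kind == "tool" for kind, _ in history):
        return {"tool_calls": [{"name": "add", "args": {"a": 2, "b": 3}}]}
    return {"content": f"The result is {history[-1][1]}."}

TOOLS = {"add": lambda a, b: a + b}

def run_agent(task, max_steps=5):
    history = [("user", task)]
    for _ in range(max_steps):
        reply = toy_model(history)
        if "tool_calls" not in reply:      # no tool calls -> route to END
            return reply["content"]
        for tc in reply["tool_calls"]:     # the ToolNode equivalent
            result = TOOLS[tc["name"]](**tc["args"])
            history.append(("tool", result))
    return "step limit reached"

print(run_agent("add 2 and 3"))  # -> The result is 5.
```

LangGraph's conditional edge plays exactly the role of the `if "tool_calls" not in reply` branch, and the `max_steps` guard corresponds to the graph's recursion_limit.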
def run(task: str, max_steps: int = 25):
    print("="*78); print(f"🦌 TASK: {task}"); print("="*78)
    state = {"messages": [HumanMessage(content=task)]}
    n = 0
    for ev in APP.stream(state, {"recursion_limit": max_steps*2}, stream_mode="updates"):
        for node, payload in ev.items():
            for m in payload.get("messages", []):
                n += 1
                if isinstance(m, AIMessage):
                    if m.tool_calls:
                        for tc in m.tool_calls:
                            args = json.dumps(tc["args"], ensure_ascii=False)
                            args = args[:140] + ("..." if len(args) > 140 else "")
                            print(f"[{n:02}] 🔧 {tc['name']}({args})")
                    else:
                        txt = m.content if isinstance(m.content, str) else str(m.content)
                        print(f"[{n:02}] 🦌 {txt[:800]}")
                elif isinstance(m, ToolMessage):
                    s = str(m.content).replace("\n", " ")[:220]
                    print(f"[{n:02}] 📤 {s}")
    print("\n"+"="*78); print("✅ COMPLETE - sandbox state:"); print("="*78)
    print(file_list.invoke({"path": ""}))
    print("\n🧠 Long-term memory:"); print(recall.invoke({}))
    for f in sorted((SANDBOX/"outputs").rglob("*")):
        if f.is_file():
            print(f"\n--- 📄 {f.relative_to(SANDBOX)} (first 800 chars) ---")
            print(f.read_text()[:800])
run(
"Give me a briefing on small language models (SLMs) in 2025. "
"(1) discover skills; (2) spawn a researcher sub-agent to gather "
"specifics on three notable SLMs from 2024-2025 with sizes, benchmarks, "
"and use cases - sub-agent saves to workspace/slm_research.md; "
"(3) load report-generation skill and write outputs/slm_briefing.md "
"(~400 words) with a Sources section; (4) save the single most "
"important takeaway to long-term memory; (5) summarize.",
max_steps=25,
)
We define a run() function that launches the user task, streams each agent step, and prints tool calls, tool outputs, and final responses in a readable format. We also display the sandbox file structure, long-term memory, and output files generated after the workflow completes. We finish by running a test task in which the Groq-powered agent researches small language models, prepares a briefing, saves a report, and stores the key takeaway in memory.
In conclusion, we have created a compact yet capable Groq-based agentic framework that demonstrates how Groq’s OpenAI-compatible API can serve as a fast, accessible backend for advanced LLM workflows. We used LangGraph to manage the agent loop, LangChain to connect the tools to the Groq-hosted model, and custom Python utilities to give the system controlled access to search, files, code execution, and memory. We also showed how isolated sub-agents can handle focused research tasks while the lead agent coordinates the overall workflow. The result is a practical Groq-powered agent system that can be extended into research assistants, automated briefing generators, and multi-step AI applications.