AutoGen

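AutoGen treats collaboration as conversation. If CrewAI is a job description, AutoGen is the meeting minutes. The starting idea is that agents reach agreement by exchanging messages.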

From v0.2 to v0.4

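v0.4 shipped in late 2024 as a major rewrite. Its core is an asynchronous actor model: every agent becomes an actor with a mailbox, and messages flow through queues. It is not compatible with the synchronous call model of v0.2.

v0.4 is not a follow-up to v0.2; it is a library written again from scratch.

The packaging was split as well. The core lives in autogen-core, the multi-agent conversation abstractions in autogen-agentchat, and external adapters in autogen-ext.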

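The actor model at a glance

Each agent is an independent asynchronous actor. It receives a message, processes it, and sends messages to other actors. The decisive difference from the synchronous model is that the caller is not blocked until a response arrives. Non-blocking message passing is the default.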
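The mailbox idea can be sketched with nothing but asyncio. This is a toy illustration and not AutoGen's runtime: the Actor class, its send and run methods, and the log list are all hypothetical names chosen for this example.

```python
import asyncio


class Actor:
    """A toy actor: one mailbox (an asyncio.Queue) drained by one task."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.mailbox: asyncio.Queue = asyncio.Queue()
        self.log: list = []

    def send(self, sender: str, text: str) -> None:
        # Fire-and-forget: enqueue and return without waiting for a reply.
        self.mailbox.put_nowait((sender, text))

    async def run(self) -> None:
        # Handle messages one at a time, in arrival order.
        while True:
            sender, text = await self.mailbox.get()
            self.log.append(f"{sender} -> {self.name}: {text}")
            self.mailbox.task_done()


async def demo() -> list:
    critic = Actor("critic")
    worker = asyncio.create_task(critic.run())
    critic.send("researcher", "draft v1")
    critic.send("researcher", "draft v2")  # the sender was never blocked
    await critic.mailbox.join()            # wait until both are handled
    worker.cancel()
    return critic.log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The sender enqueues two messages back to back and moves on; the actor drains its mailbox on its own schedule. That separation is what the synchronous v0.2 call model did not have.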


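AssistantAgent: the LLM worker

AssistantAgent is the agent dedicated to LLM calls. name, model_client, tools, and system_message are the arguments you will set almost every time. Tools are passed as a list of Python functions (async in the examples here), and reflect_on_tool_use=True turns on reflection: after a tool call, the model is invoked once more to compose the reply from the tool result.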

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def web_search(query: str) -> str:
  """Fake web search."""
  return f"results for {query}"

model_client = OpenAIChatCompletionClient(model="gpt-4o")

researcher = AssistantAgent(
  name="researcher",
  model_client=model_client,
  tools=[web_search],
  system_message="You are a senior researcher. Never make a claim without a source.",
  reflect_on_tool_use=True,
)

// There is no official AutoGen v0.4 SDK for TypeScript, so the actor-message pattern is implemented by hand.
type Message = { from: string; to?: string; content: string }

export abstract class Agent {
  constructor(public name: string) {}
  abstract handle(msg: Message): Promise<Message[]>
}

export class AssistantAgent extends Agent {
  constructor(
    name: string,
    private systemMessage: string,
    private llm: (msgs: Message[]) => Promise<string>,
    // Kept to mirror the Python signature; this mini version never calls them.
    private tools: Record<string, (args: any) => Promise<string>> = {},
  ) { super(name) }

  async handle(msg: Message): Promise<Message[]> {
    // Only the system prompt and the latest message reach the model.
    const reply = await this.llm([
      { from: 'system', content: this.systemMessage },
      msg,
    ])
    return [{ from: this.name, content: reply }]
  }
}

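UserProxyAgent: stand-in for the human and the executor

In v0.2, UserProxyAgent played two roles at once: it took the next input on behalf of a human, and with a code executor attached it ran code generated by other agents in a sandbox. v0.4 splits these roles. UserProxyAgent only relays human input, and CodeExecutorAgent runs the code. Attaching a code executor is what makes an automatic debugging loop possible.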

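from autogen_agentchat.agents import CodeExecutorAgent, UserProxyAgent
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

# Human-input mode
human = UserProxyAgent(
  name="human",
  input_func=input,  # read from the CLI
)

# Code-executor mode: runs code blocks produced by other agents.
# In v0.4 this is CodeExecutorAgent; UserProxyAgent takes no code_executor.
docker = DockerCommandLineCodeExecutor(work_dir="/tmp/run")
await docker.start()  # start the container before the first run
executor = CodeExecutorAgent(
  name="executor",
  code_executor=docker,
)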

export class UserProxyAgent extends Agent {
  constructor(
    name: string,
    private mode: 'human' | 'executor',
    private readInput?: () => Promise<string>,
    private exec?: (code: string) => Promise<string>,
  ) { super(name) }

  async handle(msg: Message): Promise<Message[]> {
    if (this.mode === 'human' && this.readInput) {
      const text = await this.readInput()
      return [{ from: this.name, content: text }]
    }
    if (this.mode === 'executor' && this.exec) {
      const code = extractCodeBlock(msg.content)
      const out = code ? await this.exec(code) : '(no code)'
      return [{ from: this.name, content: out }]
    }
    return []
  }
}

function extractCodeBlock(s: string): string | null {
  const m = s.match(/```(?:python)?\n([\s\S]*?)\n```/)
  return m?.[1] ?? null
}

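GroupChat: the meeting container

GroupChat is the stage where several agents are gathered to talk. In v0.4, RoundRobinGroupChat and SelectorGroupChat from the autogen_agentchat.teams module are the standard choices. The former fixes the speaking order round-robin; in the latter an LLM picks the next speaker on every turn. Either way, the team has the ChatManager's role built in.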

from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination

# Termination condition: stop once someone says "APPROVE"
termination = TextMentionTermination("APPROVE")

team = RoundRobinGroupChat(
  [researcher, critic],  # critic: another AssistantAgent, defined like researcher
  termination_condition=termination,
)

result = await team.run(task="Let's put together one paragraph on the A2A protocol.")
print(result.messages[-1].content)

type Termination = (history: Message[]) => boolean

export class RoundRobinGroupChat {
  constructor(
    private agents: Agent[],
    private terminate: Termination,
  ) {}

  async run(task: string): Promise<Message[]> {
    const history: Message[] = [{ from: 'user', content: task }]
    let i = 0
    while (!this.terminate(history)) {
      const speaker = this.agents[i % this.agents.length]
      const replies = await speaker.handle(history[history.length - 1])
      history.push(...replies)
      i += 1
    }
    return history
  }
}

const textMentionTermination = (needle: string): Termination =>
  (h) => h.some((m) => m.content.includes(needle))

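ChatManager: choosing the speaker

ChatManager decides who speaks next in a GroupChat. In v0.4, SelectorGroupChat fills this role with an LLM.

The default selector shows the model the conversation history, the candidate list, and each candidate's description, and gets back the name of the next speaker. Giving each agent a sharp description is the single biggest factor in selection accuracy.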

from autogen_agentchat.teams import SelectorGroupChat

selector_prompt = """Pick the most suitable next speaker for the conversation below.
{roles}
{history}
Output only the name of the one agent who should take this turn."""

team = SelectorGroupChat(
  [researcher, writer, critic],
  model_client=model_client,
  selector_prompt=selector_prompt,
  termination_condition=termination,
)

// Mini implementation of an LLM-based selector
export class SelectorGroupChat {
  constructor(
    private agents: Agent[],
    private selectNext: (history: Message[], roster: Agent[]) => Promise<Agent>,
    private terminate: Termination,
  ) {}

  async run(task: string): Promise<Message[]> {
    const history: Message[] = [{ from: 'user', content: task }]
    while (!this.terminate(history)) {
      const speaker = await this.selectNext(history, this.agents)
      const replies = await speaker.handle(history[history.length - 1])
      history.push(...replies)
    }
    return history
  }
}
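To make the role of the description concrete, here is a minimal sketch of what a selector does around the model call: assemble a prompt from the history and the roster, then map the reply back to a valid name. Candidate, build_selector_prompt, and parse_choice are hypothetical helpers for illustration, and the prompt wording is an assumption rather than AutoGen's actual default.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    description: str


def build_selector_prompt(history: list, roster: list) -> str:
    """history is a list of (speaker, text) pairs; roster a list of Candidate."""
    roles = "\n".join(f"{c.name}: {c.description}" for c in roster)
    transcript = "\n".join(f"{who}: {text}" for who, text in history)
    return (
        "Pick the next speaker.\n"
        f"Candidates:\n{roles}\n"
        f"Conversation:\n{transcript}\n"
        "Answer with one name only."
    )


def parse_choice(reply: str, roster: list, fallback: str) -> str:
    # Models sometimes add punctuation or extra words, so accept the first
    # roster entry whose name appears in the reply; otherwise fall back.
    lowered = reply.lower()
    for c in roster:
        if c.name.lower() in lowered:
            return c.name
    return fallback
```

The model can only pick well if the "name: description" lines tell the candidates apart, which is why vague descriptions hurt selection. The fallback keeps the loop moving when the reply names nobody on the roster.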

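Termination conditions

A conversation has to end. AutoGen treats termination conditions as objects. TextMentionTermination fires on a specific string, MaxMessageTermination on a message-count cap, and ExternalTermination on an external trigger. They compose as OR with the | operator. The team evaluates the condition right after each utterance.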

from autogen_agentchat.conditions import (
  MaxMessageTermination,
  TextMentionTermination,
  ExternalTermination,
)

stop = (
  TextMentionTermination("APPROVE")
  | MaxMessageTermination(max_messages=12)
)

ext = ExternalTermination()
team = RoundRobinGroupChat([a, b], termination_condition=stop | ext)
# Calling ext.set() from anywhere stops the team once the current turn ends

const maxMessages = (n: number): Termination => (h) => h.length >= n
const orT = (...ts: Termination[]): Termination =>
  (h) => ts.some((t) => t(h))

const stop = orT(
  textMentionTermination('APPROVE'),
  maxMessages(12),
)
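Composing condition objects with | can be pictured as ordinary operator overloading. The sketch below is an assumption about the general shape, not AutoGen's implementation: Condition, TextMention, MaxMessages, and AnyOf are hypothetical classes, and the history is simplified to a list of strings.

```python
class Condition:
    """Base class: subclasses decide from the message history."""

    def check(self, history: list) -> bool:
        raise NotImplementedError

    def __or__(self, other: "Condition") -> "Condition":
        # a | b builds a new condition that fires when either side fires.
        return AnyOf(self, other)


class TextMention(Condition):
    def __init__(self, needle: str) -> None:
        self.needle = needle

    def check(self, history: list) -> bool:
        return any(self.needle in message for message in history)


class MaxMessages(Condition):
    def __init__(self, limit: int) -> None:
        self.limit = limit

    def check(self, history: list) -> bool:
        return len(history) >= self.limit


class AnyOf(Condition):
    def __init__(self, *parts: Condition) -> None:
        self.parts = parts

    def check(self, history: list) -> bool:
        return any(part.check(history) for part in self.parts)


stop = TextMention("APPROVE") | MaxMessages(3)
```

Because the result of | is itself a Condition, the composed object can be passed anywhere a single condition is accepted, and further | calls nest without special handling.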

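The code execution loop

When the AssistantAgent writes Python code, the executor agent runs it and the output flows back to the assistant. This ping-pong between two actors is AutoGen's signature pattern. It is another face of ReAct: the agent refines its answer while observing its own results.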
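The ping-pong can be sketched without any framework. In this illustration the LLM is replaced by scripted_assistant, a hypothetical stub that returns a buggy first draft and a fixed one after it sees an error, and run_python is a hypothetical executor that runs code in a fresh interpreter. Note the assumption: a plain subprocess is not a sandbox, so real use would call for something like the Docker executor.

```python
import subprocess
import sys
from typing import Optional, Tuple


def run_python(code: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Run code in a fresh interpreter; return (exit code, combined output)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, (proc.stdout + proc.stderr).strip()


def scripted_assistant(feedback: Optional[str]) -> str:
    """Stand-in for the LLM: buggy first draft, fixed once it sees an error."""
    if feedback is None:
        return "print(total([1, 2, 3]))"  # NameError: total is not defined
    return "print(sum([1, 2, 3]))"


def write_run_observe(max_rounds: int = 3) -> Tuple[int, str]:
    """Alternate writer and executor until the code exits cleanly."""
    feedback = None
    for round_no in range(1, max_rounds + 1):
        code = scripted_assistant(feedback)
        exit_code, output = run_python(code)
        if exit_code == 0:
            return round_no, output
        feedback = output  # the traceback goes back to the writer
    raise RuntimeError("no working code within the round limit")
```

The round limit plays the same role as a message-count cap: without it, a writer that never fixes its bug would loop forever.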


run vs run_stream

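team.run(task=...) waits until the end and returns the result in one piece. team.run_stream(...) returns an async iterator that yields message by message and ends with the final TaskResult, so it plugs straight into a UI. You can clear a team with reset(), or call run() again with no arguments to continue the previous conversation.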

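from autogen_agentchat.base import TaskResult

# All at once, at the end
result = await team.run(task="Summarize this.")

# Stream, one message at a time
async for msg in team.run_stream(task="Summarize this."):
  if isinstance(msg, TaskResult):  # the last item is the final result
    print("stop reason:", msg.stop_reason)
  else:
    print(msg.source, ":", msg.content)

# Start over, or pick up where you left off
await team.reset()                    # clear state
await team.run(task="Another topic")  # start from scratch
# Or call run() with no task to continue the previous conversation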

// Expose both APIs
export class Team {
  constructor(public chat: RoundRobinGroupChat | SelectorGroupChat) {}
  run(task: string) { return this.chat.run(task) }
  // Simplification: this waits for the whole run, then replays the messages.
  async *runStream(task: string) {
    for (const m of await this.chat.run(task)) yield m
  }
}
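The Team wrapper above yields only after the whole run has finished. A stream that is useful to a UI produces each message as soon as it exists, and the blocking call can then be defined as the stream collected into a list. The sketch below shows that relationship with async generators; speak, run_stream, run, and first_message are hypothetical names, and asyncio.sleep(0) stands in for a model call.

```python
import asyncio
from typing import AsyncIterator, List


async def speak(turn: int) -> str:
    await asyncio.sleep(0)  # stands in for a model call
    return f"message {turn}"


async def run_stream(turns: int) -> AsyncIterator[str]:
    # Yield each message as soon as it is produced.
    for turn in range(1, turns + 1):
        yield await speak(turn)


async def run(turns: int) -> List[str]:
    # The wait-for-everything variant is just the stream, collected.
    return [message async for message in run_stream(turns)]


async def first_message(turns: int) -> str:
    # A consumer can act on the first message without waiting for the rest.
    async for message in run_stream(turns):
        return message
    raise ValueError("empty stream")
```

Building run on top of run_stream, rather than the other way around, keeps one code path for producing messages and lets callers choose how long to wait.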

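Bundling tools and code

The most common setup is a three-actor team: assistant, code executor, and reviewer. The assistant drafts the answer, the executor runs verification code, and the reviewer grades the result. Put all three in a group chat and the self-check loop is complete.

Compared with CrewAI and LangGraph

The three frameworks see the same multi-agent problem differently.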

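Framework	Primary abstraction	Flow control
LangGraph	Graph, nodes, edges	Static/conditional edges
CrewAI	Roles, tasks	sequential / hierarchical
AutoGen	Actors, conversation	ChatManager picks the speaker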

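If free-form conversation is the essence, pick AutoGen; if explicit branching matters, LangGraph; if job cards map most cleanly onto the domain, CrewAI. Chapter 21 compares them in more detail.

On to the next chapter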

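Having looked at three frameworks, we now build one by hand without a framework, using little more than a message queue. Chapter 20 assembles the same system from the ground up and shows in code what is actually going on underneath.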