Framework comparison and selection guide

LangGraph, CrewAI, AutoGen, and a do-it-yourself (DIY) implementation. Faced with four paths, what criteria should guide the choice?

The comparison trap

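A verdict like “X is the best” is useless. Even for the same tool, the conclusion flips when the evaluation axis changes. Whether the goal is a quick PoC or long-term operation, and how much cost the team can absorb, differs from one situation to the next.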

A framework is a hypothesis, not an answer. Decide first which hypothesis you want to test.

This chapter pits the four frameworks against the same task along five axes.

Same task, four implementations

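To keep the comparison fair, the same task, “two agents collaborate to polish a piece of writing,” is built four ways. LangGraph expresses it as a graph, CrewAI as roles, AutoGen as a conversation, and the DIY version as functions.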


Because the levels of abstraction differ, the amount of code each version needs becomes part of the comparison too.

LangGraph: graph-based

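You draw the state machine yourself. Because nodes and edges are explicit, the learning curve is steep, but debugging the flow is easy. It also takes the highest score on several axes of the matrix later in this chapter, including observability and ecosystem maturity.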

# Verified against: https://langchain-ai.github.io/langgraph/
# Verified at: 2026-06-02
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class S(TypedDict):
  draft: str
  review: str

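def writer(s: S) -> S:
  # First node: turn the seed text into a draft.
  return {"draft": "Draft: " + s["draft"], "review": ""}

def editor(s: S) -> S:
  # Second node: keep the draft and record the review result.
  return {"draft": s["draft"], "review": "Edits complete"}

# Wire the graph explicitly: START -> writer -> editor -> END.
g = StateGraph(S)
g.add_node("writer", writer)
g.add_node("editor", editor)
g.add_edge(START, "writer")
g.add_edge("writer", "editor")
g.add_edge("editor", END)
app = g.compile()
print(app.invoke({"draft": "Agents use tools", "review": ""}))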

// Verified against: https://langchain-ai.github.io/langgraphjs/
// Verified at: 2026-06-02
import { StateGraph, START, END, Annotation } from '@langchain/langgraph'

// Shared state passed between nodes.
const S = Annotation.Root({
  draft: Annotation<string>(),
  review: Annotation<string>(),
})

// Wire the graph explicitly: START -> writer -> editor -> END.
const g = new StateGraph(S)
  .addNode('writer', (s) => ({ draft: 'Draft: ' + s.draft, review: '' }))
  .addNode('editor', (s) => ({ draft: s.draft, review: 'Edits complete' }))
  .addEdge(START, 'writer')
  .addEdge('writer', 'editor')
  .addEdge('editor', END)

const app = g.compile()
console.log(await app.invoke({ draft: 'Agents use tools', review: '' }))

CrewAI: role-based

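It carries “who does what” thinking straight into code. Getting started with CrewAI is light, but branch control is weak. The learning curve is gentle, and in exchange the difficulty surfaces in a different place.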

# Verified against: https://docs.crewai.com/concepts/crews
# Verified at: 2026-06-02
from crewai import Agent, Task, Crew

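# Each agent is defined by a role, a goal, and a backstory.
writer = Agent(role="Writer", goal="Write the draft", backstory="A skilled stylist")
editor = Agent(role="Editor", goal="Proofread", backstory="A hard-nosed desk editor")

# Each task is bound to the agent responsible for it.
t1 = Task(description="A short draft", agent=writer, expected_output="Draft")
t2 = Task(description="Fix grammar and logic", agent=editor, expected_output="Edited copy")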

crew = Crew(agents=[writer, editor], tasks=[t1, t2])
print(crew.kickoff())

// Verified against: https://docs.crewai.com/
// Verified at: 2026-06-02
// CrewAI is Python-first; from TS, call it over HTTP instead.
// The endpoint below assumes a self-hosted wrapper service around the crew.
const res = await fetch('http://localhost:8000/kickoff', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    crew: 'writer-editor',
    inputs: { topic: 'Introduction to agents' },
  }),
})
console.log(await res.json())

AutoGen: conversation-based

In AutoGen, agents exchange messages with one another. The flow feels natural, but the context grows quickly, and that growth comes back as token cost.

# Verified against: https://microsoft.github.io/autogen/stable/
# Verified at: 2026-06-02
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_ext.models.openai import OpenAIChatCompletionClient

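async def main():
  # The OpenAI client reads the API key from the OPENAI_API_KEY environment variable.
  m = OpenAIChatCompletionClient(model="gpt-4o-mini")
  w = AssistantAgent("writer", model_client=m, system_message="You write the draft.")
  e = AssistantAgent("editor", model_client=m, system_message="You edit the draft.")
  # Agents take turns in order; stop once four messages have been exchanged.
  team = RoundRobinGroupChat([w, e],
      termination_condition=MaxMessageTermination(4))
  result = await team.run(task="A short introduction to agents")
  # Print the conversation so the run produces visible output.
  for msg in result.messages:
    print(msg.source, ":", msg.content)

asyncio.run(main())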

// Verified against: https://microsoft.github.io/autogen/
// Verified at: 2026-06-02
// AutoGen's core is Python. From Node it is usually called through a REST gateway.
// The endpoint below assumes such a self-hosted gateway.
const r = await fetch('http://localhost:8001/team/run', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    team: 'writer-editor',
    task: 'A short introduction to agents',
    max_messages: 4,
  }),
})
console.log(await r.json())
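The point about context growth can be made concrete with a little arithmetic. The sketch below is not AutoGen code; it assumes, for illustration only, that every turn resends the full message history as input and that every message is the same size. The function name and the 200-token figure are hypothetical.

```python
def cumulative_input_tokens(turns: int, tokens_per_message: int) -> int:
    """Total input tokens billed when each turn resends the whole history.

    Assumption: the task prompt and every reply are tokens_per_message long.
    """
    total = 0
    history = tokens_per_message  # the task prompt is the first message
    for _ in range(turns):
        total += history               # the full history goes in as input
        history += tokens_per_message  # the reply is appended to the history
    return total


for turns in (4, 8, 16):
    print(turns, "turns ->", cumulative_input_tokens(turns, 200), "input tokens")
```

Under these assumptions the total grows with the square of the number of turns, which is why a cap such as `MaxMessageTermination` acts as a cost control as well as a stop condition.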

DIY: composing functions

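asyncio, a queue, and FastAPI are about all you need. A DIY implementation has few dependencies and virtually no vendor lock-in. The cost of building observability, evaluation, and recovery yourself is a separate matter.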

# Verified against: Chapter 20, DIY implementation
# Verified at: 2026-06-02
import asyncio

async def writer(text: str) -> str:
  # Yield control once, standing in for a real model call.
  await asyncio.sleep(0)
  return "Draft: " + text

async def editor(text: str) -> str:
  await asyncio.sleep(0)
  return text + " (edited)"

async def pipeline(seed: str) -> str:
  # Plain function composition: the editor consumes the writer's output.
  return await editor(await writer(seed))

print(asyncio.run(pipeline("Agents use tools")))

// Verified against: Chapter 20, DIY implementation
// Verified at: 2026-06-02
const writer = async (t: string) => 'Draft: ' + t
const editor = async (t: string) => t + ' (edited)'

// Plain function composition: the editor consumes the writer's output.
const pipeline = async (seed: string) => editor(await writer(seed))
console.log(await pipeline('Agents use tools'))
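The text mentions a queue alongside asyncio, while the listings above only compose two functions. As a minimal sketch of what the queue adds, the version below decouples the two stages with `asyncio.Queue` so they run concurrently. The names `run_pipeline` and `SENTINEL` are illustrative assumptions, not taken from Chapter 20.

```python
import asyncio

SENTINEL = None  # hypothetical end-of-stream marker


async def writer(seeds, out_q):
    # Producer: push one draft per seed, then signal completion.
    for seed in seeds:
        await out_q.put("Draft: " + seed)
    await out_q.put(SENTINEL)


async def editor(in_q, results):
    # Consumer: edit drafts until the end-of-stream marker arrives.
    while True:
        draft = await in_q.get()
        if draft is SENTINEL:
            break
        results.append(draft + " (edited)")


async def run_pipeline(seeds):
    # A bounded queue applies backpressure if the editor falls behind.
    q = asyncio.Queue(maxsize=2)
    results = []
    await asyncio.gather(writer(seeds, q), editor(q, results))
    return results


print(asyncio.run(run_pipeline(["Agents use tools", "Agents plan ahead"])))
```

The bounded queue is the design choice worth noting: it keeps a fast writer from piling up unedited drafts in memory, which a framework would otherwise handle for you.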

The eight-axis matrix

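Here we use eight axes rather than five. Scores are out of 5, and higher is better on every axis (a 5 for vendor lock-in means little lock-in). The heart of the matrix is the choice of axes itself. Some of these scores shift quickly over time, so read them as a snapshot.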

Axis	LangGraph	CrewAI	AutoGen	DIY
Learning curve	2	4	3	4
Branch control	5	2	3	5
Multi-agent	4	5	5	3
Observability	5	3	3	1
Ecosystem maturity	5	4	4	1
Vendor lock-in	3	3	3	5
Self-hosting	4	4	4	5
Cost visibility	5	3	3	4

The weight of each axis depends on the team's priorities.

Recommendations by case

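Fix the situation first, then pick the tool. The reverse order is a trap. Scores mean little without weights: two teams with different top priorities will arrive at different answers.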

This classification is a starting point, not the answer.
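To illustrate how priorities flip the outcome, the sketch below applies two weight profiles to the matrix scores from earlier in this chapter. The profile names and weight values are hypothetical, chosen only to show the effect; they are not recommendations.

```python
# Axis order: learning curve, branch control, multi-agent, observability,
# ecosystem maturity, vendor lock-in, self-hosting, cost visibility.
SCORES = {
    "langgraph": [2, 5, 4, 5, 5, 3, 4, 5],
    "crewai":    [4, 2, 5, 3, 4, 3, 4, 3],
    "autogen":   [3, 3, 5, 3, 4, 3, 4, 3],
    "diy":       [4, 5, 3, 1, 1, 5, 5, 4],
}

# Hypothetical weight profiles; each sums to 1.
PROFILES = {
    "observability_first": [0.05, 0.15, 0.10, 0.30, 0.20, 0.05, 0.05, 0.10],
    "independence_first":  [0.20, 0.10, 0.05, 0.05, 0.05, 0.30, 0.20, 0.05],
}


def best(weights):
    """Return the framework with the highest weighted score."""
    totals = {
        name: sum(s * w for s, w in zip(row, weights))
        for name, row in SCORES.items()
    }
    return max(totals, key=totals.get)


for profile, weights in PROFILES.items():
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    print(profile, "->", best(weights))
```

The same matrix yields a different winner under each profile, which is the reason the situation has to be fixed before the tool.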

Building a decision table

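Write the weights down explicitly. Otherwise the word “somehow” ends up running the decision. Multiply the matrix scores above by the weights to reduce each framework to a single score. It is natural to give a larger weight to the items that matter more to the team. Scores on fast-moving axes change easily, so re-evaluate them every quarter.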

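# Verified against: the matrix in this chapter
# Verified at: 2026-06-02
# Axis order: learning curve, branch control, multi-agent, observability,
# ecosystem maturity, vendor lock-in, self-hosting, cost visibility.
scores = {
  "langgraph":   [2, 5, 4, 5, 5, 3, 4, 5],
  "crewai":      [4, 2, 5, 3, 4, 3, 4, 3],
  "autogen":     [3, 3, 5, 3, 4, 3, 4, 3],
  "diy":         [4, 5, 3, 1, 1, 5, 5, 4],
}
# One weight per axis; the weights sum to 1.
weights = [0.10, 0.20, 0.15, 0.15, 0.10, 0.10, 0.10, 0.10]

for name, s in scores.items():
  total = sum(a * b for a, b in zip(s, weights))
  print(f"{name}: {total:.2f}")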

// Verified against: the matrix in this chapter
// Verified at: 2026-06-02
// Axis order: learning curve, branch control, multi-agent, observability,
// ecosystem maturity, vendor lock-in, self-hosting, cost visibility.
const scores: Record<string, number[]> = {
  langgraph: [2, 5, 4, 5, 5, 3, 4, 5],
  crewai:    [4, 2, 5, 3, 4, 3, 4, 3],
  autogen:   [3, 3, 5, 3, 4, 3, 4, 3],
  diy:       [4, 5, 3, 1, 1, 5, 5, 4],
}
// One weight per axis; the weights sum to 1.
const weights = [0.10, 0.20, 0.15, 0.15, 0.10, 0.10, 0.10, 0.10]

for (const [name, s] of Object.entries(scores)) {
  const total = s.reduce((acc, v, i) => acc + v * weights[i], 0)
  console.log(`${name}: ${total.toFixed(2)}`)
}

The price tag of vendor lock-in

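Vendor lock-in is not simply a matter of “it is hard to leave.” Price increases, API changes, and service shutdowns are all costs of lock-in. A DIY implementation sets this cost to zero, but in exchange you build observability, evaluation, and recovery yourself.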

Measure lock-in not in lines of code but in after-the-fact cost. Estimate what a migration would cost one year from now.
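One way to put a number on that after-the-fact cost is a simple expected-value estimate over the three events named above. This is a rough sketch: the `LockInRisk` type, the probabilities, and the person-day figures are all made-up placeholders to be replaced with a team's own estimates.

```python
from dataclasses import dataclass


@dataclass
class LockInRisk:
    name: str
    probability: float  # assumed chance of the event within one year (0..1)
    cost: float         # assumed cost if it happens, in person-days


def expected_cost(risks):
    """Sum of probability-weighted costs across all risks."""
    return sum(r.probability * r.cost for r in risks)


# Placeholder figures for illustration only.
risks = [
    LockInRisk("price increase", 0.5, 10),
    LockInRisk("breaking API change", 0.4, 20),
    LockInRisk("service shutdown", 0.1, 60),
]

print(f"expected one-year lock-in cost: {expected_cost(risks):.1f} person-days")
```

Comparing this figure with the estimated effort of building observability, evaluation, and recovery in-house gives the DIY trade-off a common unit.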

Decision flow

Narrow it down with three questions. When the decision table is too heavy, a flowchart is faster. Three factors carry the most weight in the decision.
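A three-question flow can be written as a short function. The questions used here are illustrative assumptions drawn from the matrix (lock-in, branch control, and collaboration style), not the chapter's own list, and the order in which they are asked is itself a judgment call.

```python
def recommend(must_avoid_lock_in: bool,
              needs_explicit_branching: bool,
              agents_converse: bool) -> str:
    """Hypothetical three-question decision flow over the four options."""
    # Question 1: is independence from any framework vendor non-negotiable?
    if must_avoid_lock_in:
        return "diy"
    # Question 2: does the workflow need explicit, inspectable branching?
    if needs_explicit_branching:
        return "langgraph"
    # Question 3: do agents negotiate by message, or follow fixed roles?
    return "autogen" if agents_converse else "crewai"


print(recommend(must_avoid_lock_in=False,
                needs_explicit_branching=True,
                agents_converse=False))
```

Like the case classification earlier, the result is a starting point to test, not a verdict.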