Multi-Model Essay Grading
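AI ↔ AI
Three LLMs grade the essay in parallel; the scores are aggregated and checked for bias before the final grade is issued.
7 nodes · 9 connections
education
agent · system
Visualization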
Essay Intake · system
Receives the submitted essay and grading rubric, and anonymizes the student's identity.
↓ parallel → Grader A (GPT-4o)
↓ parallel → Grader B (Claude)
↓ parallel → Grader C (Gemini)
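The anonymization step at intake could be sketched as follows. This is a minimal illustration, not the workflow's actual implementation: the `Submission` fields, the `anonymize` helper, and the salted-hash pseudonym are all assumptions introduced here.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """Hypothetical intake record; the field names are assumptions."""
    student_name: str
    student_id: str
    essay: str
    rubric: str


def anonymize(sub: Submission, salt: str = "demo-salt") -> dict:
    """Return a grader-facing payload with identifying fields removed."""
    # Stable pseudonym so grades can be mapped back to the student later.
    digest = hashlib.sha256((salt + sub.student_id).encode("utf-8")).hexdigest()
    pseudonym = "anon-" + digest[:12]
    essay = sub.essay
    if sub.student_name:
        # Redact literal occurrences of the student's name in the essay body.
        essay = re.sub(re.escape(sub.student_name), "[REDACTED]", essay,
                       flags=re.IGNORECASE)
    return {"submission_id": pseudonym, "essay": essay, "rubric": sub.rubric}
```

Only the returned dictionary would be passed to the graders; the mapping from pseudonym back to student stays with the intake system.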
Grader A (GPT-4o) · agent
Scores the essay on structure, argument quality, use of evidence, and clarity of writing.
↓ parallel → Score Aggregation
Grader B (Claude) · agent
Scores independently against the same rubric, blind to the other graders.
↓ parallel → Score Aggregation
Grader C (Gemini) · agent
Scores independently as a third grader so that a robust consensus can be formed.
↓ parallel → Score Aggregation
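The parallel fan-out to the three graders and the fan-in to aggregation might look like the sketch below. The `stub_grader` functions are hypothetical stand-ins for real model calls, and the scores they return are made up for illustration; only the node ids come from the workflow.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A grader takes (essay, rubric) and returns a numeric score.
Grader = Callable[[str, str], float]


def grade_in_parallel(essay: str, rubric: str,
                      graders: Dict[str, Grader]) -> Dict[str, float]:
    """Run every grader concurrently; no grader sees another's result."""
    with ThreadPoolExecutor(max_workers=max(1, len(graders))) as pool:
        futures = {name: pool.submit(fn, essay, rubric)
                   for name, fn in graders.items()}
        # Collecting results here is the fan-in point before aggregation.
        return {name: fut.result() for name, fut in futures.items()}


def stub_grader(score: float) -> Grader:
    """Hypothetical placeholder for an LLM call; always returns `score`."""
    return lambda essay, rubric: score


scores = grade_in_parallel(
    "essay text", "rubric text",
    {
        "grader_1": stub_grader(82.0),
        "grader_2": stub_grader(78.0),
        "grader_3": stub_grader(80.0),
    },
)
```

Each grader receives the same essay and rubric and nothing else, which is what keeps the grading blind.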
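Score Aggregation · system
Computes the weighted average score and flags any grader whose score deviates from that average by more than 15%.
↓ sequential → Bias Detection Agent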
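A minimal sketch of the aggregation rule, under two assumptions the workflow does not spell out: weights default to equal, and "15%" is read as relative to the weighted mean. The `aggregate` function and its parameters are illustrative names.

```python
from typing import Dict, List, Optional, Tuple


def aggregate(scores: Dict[str, float],
              weights: Optional[Dict[str, float]] = None,
              tolerance: float = 0.15) -> Tuple[float, List[str]]:
    """Return (weighted mean, graders deviating from it by more than tolerance)."""
    if not scores:
        raise ValueError("at least one score is required")
    if weights is None:
        # Assumption: equal weighting when none is configured.
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    mean = sum(scores[name] * weights[name] for name in scores) / total_weight
    # Flag graders whose score differs from the mean by more than 15% of the mean.
    flagged = [name for name, score in scores.items()
               if abs(score - mean) > tolerance * abs(mean)]
    return mean, flagged
```

With scores of 80, 82, and 60 and equal weights, the mean is 74.0 and the allowed band is ±11.1, so only the grader who gave 60 is flagged.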
Bias Detection Agent · agent
Analyzes the scoring patterns for demographic bias, topic bias, or length bias.
↓ conditional → Final Grade and Feedback
↓ fallback → Grader A (GPT-4o)
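One of the three checks, length bias, together with the conditional/fallback routing, could be sketched as below. In the workflow the bias check is an LLM agent; the correlation heuristic here is a swapped-in stand-in for illustration, and `BiasReport`, `check_length_bias`, and the 0.8 threshold are assumptions rather than part of the workflow.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Sequence, Tuple


@dataclass(frozen=True)
class BiasReport:
    detected: bool
    kinds: Tuple[str, ...] = ()


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def check_length_bias(word_counts: Sequence[float], scores: Sequence[float],
                      threshold: float = 0.8) -> BiasReport:
    """Flag length bias when score tracks essay length too closely (assumed rule)."""
    if abs(pearson(word_counts, scores)) >= threshold:
        return BiasReport(True, ("length",))
    return BiasReport(False)


def next_node(report: BiasReport) -> str:
    """Mirror the two outgoing edges of the bias check."""
    if not report.detected:
        return "final_grade"   # conditional edge: bias.detected == false
    return "grader_1"          # fallback edge: re-grade with adjusted prompts
```

A batch in which scores rise in lockstep with word count would route back to grading, while an uncorrelated batch proceeds to the final grade.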
Final Grade and Feedback · agent
Issues the final grade, with combined feedback from all graders and suggestions for improvement.
uc-multi-model-grading.osop.yaml
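osop_version: "1.0"
id: "multi-model-grading"
name: "Multi-Model Essay Grading"
description: "Three LLMs grade the essay in parallel; the scores are aggregated and checked for bias before the final grade is issued."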
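nodes:
  - id: "essay_intake"
    type: "system"
    name: "Essay Intake"
    description: "Receives the submitted essay and grading rubric, and anonymizes the student's identity."
  - id: "grader_1"
    type: "agent"
    subtype: "llm"
    name: "Grader A (GPT-4o)"
    description: "Scores the essay on structure, argument quality, use of evidence, and clarity of writing."
  - id: "grader_2"
    type: "agent"
    subtype: "llm"
    name: "Grader B (Claude)"
    description: "Scores independently against the same rubric, blind to the other graders."
  - id: "grader_3"
    type: "agent"
    subtype: "llm"
    name: "Grader C (Gemini)"
    description: "Scores independently as a third grader so that a robust consensus can be formed."
  - id: "aggregate"
    type: "system"
    name: "Score Aggregation"
    description: "Computes the weighted average score and flags any grader whose score deviates from that average by more than 15%."
  - id: "bias_check"
    type: "agent"
    subtype: "llm"
    name: "Bias Detection Agent"
    description: "Analyzes the scoring patterns for demographic bias, topic bias, or length bias."
  - id: "final_grade"
    type: "agent"
    subtype: "llm"
    name: "Final Grade and Feedback"
    description: "Issues the final grade, with combined feedback from all graders and suggestions for improvement."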
edges:
  - from: "essay_intake"
    to: "grader_1"
    mode: "parallel"
  - from: "essay_intake"
    to: "grader_2"
    mode: "parallel"
  - from: "essay_intake"
    to: "grader_3"
    mode: "parallel"
  - from: "grader_1"
    to: "aggregate"
    mode: "parallel"
  - from: "grader_2"
    to: "aggregate"
    mode: "parallel"
  - from: "grader_3"
    to: "aggregate"
    mode: "parallel"
  - from: "aggregate"
    to: "bias_check"
    mode: "sequential"
  - from: "bias_check"
    to: "final_grade"
    mode: "conditional"
    when: "bias.detected == false"
  - from: "bias_check"
    to: "grader_1"
    mode: "fallback"
    label: "Bias detected, re-grade with adjusted prompts"
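
As a sanity check on the graph above, the sketch below confirms that every edge endpoint is a declared node and lists the successors of a node. The node ids and edges are transcribed by hand from the YAML rather than parsed from it, and `validate` and `successors` are illustrative helpers, not part of any OSOP tooling.

```python
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (from, to, mode)

NODES: List[str] = [
    "essay_intake", "grader_1", "grader_2", "grader_3",
    "aggregate", "bias_check", "final_grade",
]

EDGES: List[Edge] = [
    ("essay_intake", "grader_1", "parallel"),
    ("essay_intake", "grader_2", "parallel"),
    ("essay_intake", "grader_3", "parallel"),
    ("grader_1", "aggregate", "parallel"),
    ("grader_2", "aggregate", "parallel"),
    ("grader_3", "aggregate", "parallel"),
    ("aggregate", "bias_check", "sequential"),
    ("bias_check", "final_grade", "conditional"),
    ("bias_check", "grader_1", "fallback"),
]


def validate(nodes: List[str], edges: List[Edge]) -> List[str]:
    """Return a message for every edge endpoint that is not a declared node."""
    known = set(nodes)
    problems = []
    for src, dst, mode in edges:
        for endpoint in (src, dst):
            if endpoint not in known:
                problems.append(f"unknown node {endpoint!r} in {src}->{dst} ({mode})")
    return problems


def successors(edges: List[Edge], node: str) -> List[Tuple[str, str]]:
    """List (target, mode) pairs for the edges leaving `node`, in file order."""
    return [(dst, mode) for src, dst, mode in edges if src == node]
```

Note that the fallback edge returns only to `grader_1`, so a re-grade triggered by the bias check re-enters the graph through Grader A.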