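A complete, production-ready "KG + vector hybrid retrieval architecture" for TSPR / AI recommendation systems. The focus is not on concepts but on how to get it running and how it affects the recommendation results (S3/S5).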

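1. Why a "KG + vector" hybrid is necessary

👉 The conclusion first:

Capability	KG	Vector
Precise, controllable recommendation	✅	❌
Semantic understanding (natural language)	❌	✅
Explainability	✅	❌
Long-tail query coverage	❌	✅

👉 So the two have to be combined:

Final recommendation capability = explainability (KG) × coverage (Vector)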

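2. Overall architecture (production-grade)

2.1 Architecture diagram (logical layers)

Query Layer (user question)
↓
Intent recognition (TSPR S1)
↓
Dual-channel retrieval
↙ ↘
KG retrieval | Vector retrieval
↓ ↓
Structured results | Semantic results
↓
Fusion layer (core)
↓
Reranking (TSPR S3)
↓
LLM generation (S5)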

3. Core modules, one by one

3.1 Query understanding layer (entry point)

Input:

{
  "query": "best electric toothbrush for college students with braces"
}

Output (structured):

{
  "user": "college_student",
  "problem": "braces",
  "intent": "recommendation",
  "category": "electric_toothbrush"
}

👉 Implementation options:

LLM + prompt
or a lightweight classification model
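
As a rough illustration of the lightweight option, the sketch below maps a query to the structured output above with plain keyword lookup. The keyword tables and the names `ParsedQuery` and `parse_query` are hypothetical stand-ins; a real system would put an LLM prompt or a trained classifier behind the same interface.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional

# Hypothetical keyword tables, only large enough to cover the example query.
USER_KEYWORDS = {"college student": "college_student"}
PROBLEM_KEYWORDS = {"braces": "braces"}
INTENT_KEYWORDS = {"best": "recommendation", "recommend": "recommendation"}
CATEGORY_KEYWORDS = {"electric toothbrush": "electric_toothbrush"}


@dataclass
class ParsedQuery:
    user: Optional[str]
    problem: Optional[str]
    intent: Optional[str]
    category: Optional[str]


def _first_match(text: str, table: Dict[str, str]) -> Optional[str]:
    """Return the label of the first keyword found in the text, else None."""
    for keyword, label in table.items():
        if keyword in text:
            return label
    return None


def parse_query(query: str) -> ParsedQuery:
    """Turn a free-text query into the structured fields used downstream."""
    text = query.lower()
    return ParsedQuery(
        user=_first_match(text, USER_KEYWORDS),
        problem=_first_match(text, PROBLEM_KEYWORDS),
        intent=_first_match(text, INTENT_KEYWORDS),
        category=_first_match(text, CATEGORY_KEYWORDS),
    )


parsed = parse_query("best electric toothbrush for college students with braces")
print(asdict(parsed))
```

Fields that cannot be resolved stay `None`, which lets the later layers decide whether to fall back to vector-only retrieval.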

4. KG retrieval layer (the deterministic engine)

4.1 Cypher query template

MATCH (p:Product)-[r1:SUITABLE_FOR]->(u:User)
MATCH (p)-[r2:HAS_FEATURE]->(f:Feature)
MATCH (f)-[:SOLVES]->(pr:Problem)
WHERE u.type = $user
AND pr.name = $problem

RETURN p,
(r1.confidence * r2.confidence) AS score
ORDER BY score DESC
LIMIT 10
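
One way to call this template from application code is sketched below. It assumes a Neo4j-style session object exposing `run(query, **params)` and a `name` property on `Product` nodes (the original template returns the whole node); `FakeSession` and its rows are made up so the sketch runs without a database.

```python
from typing import Any, Dict, List

KG_QUERY = """
MATCH (p:Product)-[r1:SUITABLE_FOR]->(u:User)
MATCH (p)-[r2:HAS_FEATURE]->(f:Feature)
MATCH (f)-[:SOLVES]->(pr:Problem)
WHERE u.type = $user AND pr.name = $problem
RETURN p.name AS product, (r1.confidence * r2.confidence) AS score
ORDER BY score DESC
LIMIT 10
"""


def kg_search(session: Any, user: str, problem: str) -> List[Dict[str, Any]]:
    """Run the template and keep the best score per product."""
    best: Dict[str, float] = {}
    for record in session.run(KG_QUERY, user=user, problem=problem):
        name, score = record["product"], record["score"]
        # A product with several matching features appears once per feature.
        best[name] = max(score, best.get(name, 0.0))
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [{"product": name, "score": score} for name, score in ranked]


class FakeSession:
    """Stand-in for a database session; returns illustrative rows only."""

    def run(self, query: str, **params: Any) -> List[Dict[str, Any]]:
        return [
            {"product": "K5", "score": 0.82},
            {"product": "K5", "score": 0.70},
            {"product": "OralX Pro", "score": 0.75},
        ]


kg_results = kg_search(FakeSession(), "college_student", "braces")
print(kg_results)
```

The deduplication step matters because the template yields one row per (product, feature) pair, so the same product can otherwise occupy several of the 10 slots.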

4.2 Returned results

[
  {"product": "K5", "score": 0.82},
  {"product": "OralX Pro", "score": 0.75}
]

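5. Vector retrieval layer (the semantic engine)

5.1 Choosing a vector database

5.2 Vector design (critical)

What gets embedded:

Product vector = title + description + features + review summary

Example embedding text:

"K5 electric toothbrush, designed for college students, supports braces cleaning, soft bristles, travel-friendly"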

5.3 Query

# embed() must use the same embedding model that indexed the products
query_embedding = embed(query)

results = vector_db.search(
    embedding=query_embedding,
    top_k=10,
)
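
To make the search step concrete without committing to a particular vector database, here is a minimal brute-force sketch using cosine similarity over an in-memory dictionary. The 3-dimensional vectors and the "Manual Brush" entry are made up for illustration; a production system would use a real embedding model and an approximate nearest-neighbor index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    index: Dict[str, Sequence[float]],
    embedding: Sequence[float],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every product against the query and return the top_k matches."""
    scored = [(name, cosine(embedding, vec)) for name, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy vectors, not real embeddings.
index = {
    "K5": [0.9, 0.1, 0.0],
    "SonicPro X": [0.7, 0.7, 0.0],
    "Manual Brush": [0.0, 0.1, 0.9],
}
results = search(index, [1.0, 0.0, 0.0], top_k=2)
print(results)
```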

5.4 Returned results

[
  {"product": "K5", "score": 0.91},
  {"product": "SonicPro X", "score": 0.87}
]

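6. Fusion layer (the core of the system)

👉 This is what decides how "smart" the system is

6.1 Fusion formula (recommended)

final_score = α * KG_score + β * Vector_score + γ * Prior

Suggested parameters:

α = 0.5 (trustworthiness)

β = 0.4 (semantic match)

γ = 0.1 (brand / commercial weight)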

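6.2 Example

Product	KG	Vector	Final
K5	0.82	0.91	0.86
SonicPro X	0.6	0.87	0.71

Final also includes the γ * Prior term, and the Prior values are not listed in the table. With α = 0.5 and β = 0.4, the KG and vector terms alone contribute 0.774 for K5 and 0.648 for SonicPro X.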
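
The fusion formula can be sketched as a small merge over the two result lists from sections 4.2 and 5.4. Two choices here are assumptions rather than part of the original design: a product missing from one channel scores 0.0 in that channel, and the prior defaults to 0.0 when none is supplied.

```python
from typing import Dict, List, Optional


def fuse(
    kg_results: List[dict],
    vector_results: List[dict],
    priors: Optional[Dict[str, float]] = None,
    alpha: float = 0.5,
    beta: float = 0.4,
    gamma: float = 0.1,
) -> List[dict]:
    """Combine both channels with final = alpha*KG + beta*Vector + gamma*Prior."""
    kg = {r["product"]: r["score"] for r in kg_results}
    vec = {r["product"]: r["score"] for r in vector_results}
    priors = priors or {}
    fused = []
    for product in kg.keys() | vec.keys():
        score = (
            alpha * kg.get(product, 0.0)       # assumption: missing channel = 0.0
            + beta * vec.get(product, 0.0)
            + gamma * priors.get(product, 0.0)
        )
        fused.append({"product": product, "score": round(score, 3)})
    fused.sort(key=lambda r: r["score"], reverse=True)
    return fused


kg_results = [{"product": "K5", "score": 0.82}, {"product": "OralX Pro", "score": 0.75}]
vector_results = [{"product": "K5", "score": 0.91}, {"product": "SonicPro X", "score": 0.87}]
fused = fuse(kg_results, vector_results)
print(fused)
```

Treating a missing channel as 0.0 pushes down products that only one channel found. That may be the desired behavior when KG confirmation is required, but it is a design decision worth making explicitly.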

7. Reranking (the core of TSPR S3)

7.1 Multi-factor model

score = (
    0.4 * relevance +
    0.2 * conversion_rate +
    0.2 * rating +
    0.2 * KG_confidence
)

7.2 Optional commercial control (HIC)

if product in manual_boost:
    score += 0.1
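
Put together, the multi-factor score and the manual boost might look like the sketch below. It assumes that `relevance` is the fused score from the previous layer, that `conversion_rate` is already in [0, 1], and that `rating` is a 0 to 5 star value that needs normalizing so the four weighted terms stay comparable. All candidate values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass
class Candidate:
    name: str
    relevance: float        # assumed: fused score from the fusion layer, in [0, 1]
    conversion_rate: float  # assumed: already normalized to [0, 1]
    rating: float           # assumed: star rating on a 0-5 scale
    kg_confidence: float    # confidence from the KG channel, in [0, 1]


def rerank_score(c: Candidate, manual_boost: Set[str]) -> float:
    """Weighted multi-factor score plus an optional manual boost."""
    score = (
        0.4 * c.relevance
        + 0.2 * c.conversion_rate
        + 0.2 * (c.rating / 5.0)  # normalize stars to [0, 1]
        + 0.2 * c.kg_confidence
    )
    if c.name in manual_boost:
        score += 0.1
    return score


def rerank(cands: Iterable[Candidate], manual_boost: Set[str]) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from best to worst."""
    scored = [(c.name, round(rerank_score(c, manual_boost), 3)) for c in cands]
    return sorted(scored, key=lambda item: item[1], reverse=True)


candidates = [
    Candidate("K5", relevance=0.86, conversion_rate=0.5, rating=4.5, kg_confidence=0.82),
    Candidate("SonicPro X", relevance=0.71, conversion_rate=0.6, rating=4.0, kg_confidence=0.6),
]
print(rerank(candidates, manual_boost=set()))
print(rerank(candidates, manual_boost={"SonicPro X"}))
```

Because the boost is additive and not capped, a boosted product can end up above 1.0; clamp the result if downstream code expects a bounded score.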

8. LLM generation layer (S5)

8.1 Input

{
  "query": "…",
  "top_products": […]
}

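8.2 Core of the prompt

You are a recommendation system.

You must:
1. Recommend the Top-1 product first
2. Explain the reasons using the Features from the KG
3. Never fabricate information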
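
The rules above can be assembled into a prompt string along the following lines. The shape of each product entry (a `name` plus a list of KG `features`) is an assumption, since the original leaves the structure of `top_products` unspecified, and `build_prompt` is a hypothetical helper name.

```python
from typing import Dict, List


def build_prompt(query: str, top_products: List[Dict]) -> str:
    """Render the fixed rules plus the ranked products and their KG features."""
    lines = [
        "You are a recommendation system.",
        "You must:",
        "1. Recommend the Top-1 product first",
        "2. Explain the reasons using the Features from the KG",
        "3. Never fabricate information",
        "",
        f"User query: {query}",
        "Ranked products (best first):",
    ]
    for rank, product in enumerate(top_products, start=1):
        # Assumed field: "features" holds the Feature names retrieved from the KG.
        features = ", ".join(product.get("features", [])) or "none listed"
        lines.append(f"{rank}. {product['name']} | KG features: {features}")
    return "\n".join(lines)


prompt = build_prompt(
    "best electric toothbrush for college students with braces",
    [{"name": "K5", "features": ["soft bristles", "orthodontic cleaning"]}],
)
print(prompt)
```

Passing the KG features explicitly gives the model something concrete to cite, which is what makes rule 3 checkable after generation.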

8.3 Output (controllable)

K5 Electric Toothbrush is ideal for college students with braces because:

- soft bristles protect gums
- designed for orthodontic cleaning

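9. Key optimizations (where the real gap opens up)

9.1 Use the KG to filter before vector ranking (strongly recommended)

👉 Narrow the candidate set with the KG first, then rank it by vector similarity

candidates = KG_top_50

vector_rank(candidates)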
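
A minimal sketch of this filter-then-rank step, assuming the KG returns a list of product names and that product embeddings are available in a dictionary. The function name `kg_then_vector` and the 2-dimensional vectors are illustrative only.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def kg_then_vector(
    kg_candidates: List[str],
    query_embedding: Sequence[float],
    product_vectors: Dict[str, Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Rank only KG-approved products by similarity to the query."""
    # Candidates without a stored vector are skipped rather than scored as 0.
    allowed = [name for name in kg_candidates if name in product_vectors]
    scored = [(name, cosine(query_embedding, product_vectors[name])) for name in allowed]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


product_vectors = {
    "K5": [0.9, 0.1],
    "OralX Pro": [0.5, 0.5],
    "SonicPro X": [1.0, 0.0],  # closest to the query, but not a KG candidate
}
ranked = kg_then_vector(["OralX Pro", "K5"], [1.0, 0.0], product_vectors)
print(ranked)
```

In this toy run SonicPro X never appears in the output even though it is the nearest vector, which is the point of the filter: the KG decides eligibility and the vectors only decide order.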

9.2 Query rewriting (improves recall)

Original query:
"cheap toothbrush for students"

Rewritten:
"budget electric toothbrush for college students"

9.3 Multi-vector strategy

query_embedding

intent_embedding

problem_embedding

Fusion:

score = 0.5*q + 0.3*intent + 0.2*problem
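
One reading of this formula is that `q`, `intent`, and `problem` are the similarities between each of the three embeddings and a product vector. Under that assumption, a sketch with made-up 2-dimensional vectors looks like this:

```python
import math
from typing import Dict, Sequence

# Weights taken from the formula above.
WEIGHTS = {"query": 0.5, "intent": 0.3, "problem": 0.2}


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def multi_vector_score(
    product_vec: Sequence[float],
    embeddings: Dict[str, Sequence[float]],
) -> float:
    """Weighted sum of the three per-embedding similarities."""
    return sum(
        weight * cosine(embeddings[key], product_vec)
        for key, weight in WEIGHTS.items()
    )


embeddings = {
    "query": [1.0, 0.0],    # toy vectors, not real embeddings
    "intent": [0.0, 1.0],
    "problem": [1.0, 1.0],
}
score = multi_vector_score([1.0, 0.0], embeddings)
print(round(score, 4))
```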

10. System interface design (ready to integrate)

10.1 API

POST /recommend

10.2 Request

{
  "query": "best toothbrush for braces"
}

10.3 Response

{
  "products": [
    {"name": "K5", "score": 0.86}
  ],
  "explanation": "recommended due to braces support"
}
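
Independent of the web framework, the request handling can be sketched as a pure function that validates the body and shapes the response. The `pipeline` callable, the `reason` field it returns, and the error messages are assumptions for illustration; the endpoint contract itself is the one shown above.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical pipeline signature: query text in, ranked products out.
Pipeline = Callable[[str], List[Dict]]


def handle_recommend(body: str, pipeline: Pipeline) -> Tuple[int, str]:
    """Handle POST /recommend; returns (HTTP status, JSON response body)."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "body must be valid JSON"})
    query = payload.get("query") if isinstance(payload, dict) else None
    if not isinstance(query, str) or not query.strip():
        return 400, json.dumps({"error": "'query' must be a non-empty string"})
    products = pipeline(query)
    # Assumed field: the top product carries a short "reason" from the LLM layer.
    explanation = products[0].get("reason", "") if products else ""
    response = {
        "products": [{"name": p["name"], "score": p["score"]} for p in products],
        "explanation": explanation,
    }
    return 200, json.dumps(response)


def fake_pipeline(query: str) -> List[Dict]:
    """Stand-in for the full retrieval, fusion, and generation pipeline."""
    return [{"name": "K5", "score": 0.86, "reason": "recommended due to braces support"}]


status, text = handle_recommend('{"query": "best toothbrush for braces"}', fake_pipeline)
print(status, text)
```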

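11. Performance architecture (production-grade)

11.1 Latency budget

Module	Latency
KG query	20ms
Vector retrieval	50ms
Fusion	5ms
Total	<100ms

The budget covers retrieval and fusion only; query understanding, reranking, and LLM generation are not listed.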
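
Since the two retrieval channels are independent, one way to stay inside this budget is to run them concurrently with a per-channel timeout, so retrieval costs roughly the slower channel, not the sum. The sketch below simulates the two channels with `asyncio.sleep`; the stub functions and the 100ms default timeout are assumptions, not measurements.

```python
import asyncio
import time
from typing import List, Tuple


async def kg_search(query: str) -> List[dict]:
    await asyncio.sleep(0.02)  # stands in for the ~20ms KG query
    return [{"product": "K5", "score": 0.82}]


async def vector_search(query: str) -> List[dict]:
    await asyncio.sleep(0.05)  # stands in for the ~50ms vector search
    return [{"product": "K5", "score": 0.91}]


async def retrieve(query: str, timeout: float = 0.1) -> Tuple[List[dict], List[dict]]:
    """Run both channels concurrently; a slow or failing channel yields []."""
    results = await asyncio.gather(
        asyncio.wait_for(kg_search(query), timeout),
        asyncio.wait_for(vector_search(query), timeout),
        return_exceptions=True,
    )
    # Timeouts and other errors degrade to an empty list instead of failing the request.
    kg, vec = ([] if isinstance(r, Exception) else r for r in results)
    return kg, vec


start = time.perf_counter()
kg_hits, vector_hits = asyncio.run(retrieve("best toothbrush for braces"))
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"retrieval took about {elapsed_ms:.0f} ms")
```

Degrading to an empty list keeps the endpoint responsive when one channel is slow, at the cost of a recommendation based on a single channel for that request.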

Author: tsai-spr