diff --git a/docs/cn/guides/00-products/index.md b/docs/cn/guides/00-products/index.md
index 139596a9cd..9875efa44a 100644
--- a/docs/cn/guides/00-products/index.md
+++ b/docs/cn/guides/00-products/index.md
@@ -12,11 +12,11 @@ import LanguageDocs from '@site/src/components/LanguageDocs';
cn=
'
-**Databend** —— 一个数据库,搞定所有数据。
+**Databend** —— 一套引擎撑起所有数据与场景。
-Databend 是开源的云原生数仓,把数据存储、向量搜索、SQL 分析、全文检索、地理计算都整合到一起,用 SQL 就能操作。兼容 Snowflake 语法,数据存在对象存储里,随写随查,不用来回倒腾。
+Databend 是开源的云原生数仓,把存储、向量搜索、SQL 分析、全文检索与地理计算统一到一套与 Snowflake 兼容的 SQL 接口上。所有数据都放在对象存储里,写入、分析、搜索一次到位,无需折腾多套系统。
-想用就用:云端开个 Databend Cloud,本地跑个 Docker,或者直接 `pip install databend`,都是一套代码,直接读写你的对象存储。
+想上手随时可行:可以直接开通 Databend Cloud,也能本地用 Docker 自建,甚至 `pip install databend` 嵌到现有工程——不论入口如何,运行的都是同一份内核。
'
en=
@@ -32,33 +32,33 @@ Explore the engine on [**GitHub**](https://github.com/databendlabs/databend). La
-**以下是一些您可能感兴趣的入门主题**
+**推荐先从这些主题开始了解**
**快速上手**
-- **[快速开始](/guides/deploy/quickstart)**: 使用 Docker 快速启动 Databend 并加载示例数据。
-- **[Databend Cloud](/guides/cloud)**: 启动无服务器仓库并管理您的组织。
-- **[连接到 Databend](/guides/sql-clients)**: 使用各种 SQL 客户端和编程语言进行连接。
-- **[SQL 参考](/sql)**: 浏览 Databend SQL 命令、函数和语法。
+- **[快速开始](/guides/deploy/quickstart)**: 用 Docker 几分钟内启动 Databend,并加载示例数据。
+- **[Databend Cloud](/guides/cloud)**: 创建无服务器仓库,集中管理组织与资源。
+- **[连接到 Databend](/guides/sql-clients)**: 通过常见 SQL 客户端或编程语言接入 Databend。
+- **[SQL 参考](/sql)**: 查询 Databend 支持的 SQL 命令、函数与语法。
**数据处理**
-- **[数据加载](/guides/load-data)**: 将各种来源的数据导入 Databend。
-- **[数据卸载](/guides/unload-data)**: 将 Databend 中的数据导出为不同格式。
-- **[半结构化数据](/sql/sql-functions/semi-structured-functions)**: 使用 VARIANT 类型处理 JSON、数组和嵌套数据。
+- **[数据加载](/guides/load-data)**: 把不同来源的数据导入 Databend。
+- **[数据卸载](/guides/unload-data)**: 将 Databend 数据导出为所需格式。
+- **[半结构化数据](/sql/sql-functions/semi-structured-functions)**: 借助 VARIANT 处理 JSON、数组与嵌套结构。
-**统一工作负载**
-- **[SQL 分析指南](/guides/query/sql-analytics)**: 用于分析、搜索、向量和地理工作负载的共享会话表。
-- **[JSON 与搜索指南](/guides/query/json-search)**: 使用倒排索引和 Lucene 风格的 `QUERY` 查询 VARIANT 数据。
-- **[向量数据库指南](/guides/query/vector-db)**: 在 Databend 中存储嵌入向量并运行语义相似度搜索。
-- **[地理分析指南](/guides/query/geo-analytics)**: 使用地理空间 SQL 绘制事件地图以获得实时洞察。
-- **[湖仓 ETL 指南](/guides/query/lakehouse-etl)**: 将对象存储文件流式传输到托管表中,无需数据孤岛。
+**统一引擎场景**
+- **[SQL 分析指南](/guides/query/sql-analytics)**: 用同一套引擎支撑分析、搜索、向量与地理任务。
+- **[JSON 与搜索指南](/guides/query/json-search)**: 依托倒排索引和 Elasticsearch 风格 `QUERY` 检索 VARIANT 载荷。
+- **[向量数据库指南](/guides/query/vector-db)**: 在 Databend 内存储嵌入并完成语义相似检索。
+- **[地理分析指南](/guides/query/geo-analytics)**: 借助地理空间 SQL 绘制事件地图,实时定位热点。
+- **[湖仓 ETL 指南](/guides/query/lakehouse-etl)**: 将对象存储文件流式写入托管表,杜绝数据孤岛。
**性能与扩展**
-- **[性能优化](/guides/performance)**: 通过各种策略提升查询性能。
-- **[基准测试](/guides/benchmark)**: 将 Databend 的性能与其他数据仓库进行比较。
-- **[数据湖仓](/sql/sql-reference/table-engines)**: 与 Hive、Iceberg 和 Delta Lake 无缝集成。
+- **[性能优化](/guides/performance)**: 结合多种策略加速查询与计算。
+- **[基准测试](/guides/benchmark)**: 了解 Databend 与其他数据仓库的性能对比。
+- **[数据湖仓](/sql/sql-reference/table-engines)**: 与 Hive、Iceberg、Delta Lake 无缝协作。
**社区与支持**
-- **[加入 Slack](https://link.databend.com/join-slack)**: 与 Databend 社区和核心工程师交流。
-- **[文档问题](https://github.com/databendlabs/databend-docs/issues)**: 报告问题或请求新内容。
-- **[路线图](https://github.com/databendlabs/databend/issues/14167)**: 跟踪即将推出的功能并分享反馈。
-- **[邮件联系](mailto:hi@databend.com)**: 需要帮助时直接联系团队。
+- **[加入 Slack](https://link.databend.com/join-slack)**: 与社区成员及核心工程师直接交流。
+- **[文档问题](https://github.com/databendlabs/databend-docs/issues)**: 反馈文档缺失或提交改进建议。
+- **[路线图](https://github.com/databendlabs/databend/issues/14167)**: 跟踪即将发布的功能并留下意见。
+- **[邮件联系](mailto:hi@databend.com)**: 需要即时协助时写信给我们。
diff --git a/docs/cn/guides/51-ai-functions/_category_.json b/docs/cn/guides/51-ai-functions/_category_.json
index 6293e87178..11f9346364 100644
--- a/docs/cn/guides/51-ai-functions/_category_.json
+++ b/docs/cn/guides/51-ai-functions/_category_.json
@@ -1,3 +1,3 @@
{
- "label": "Databend 人工智能(AI)与机器学习(ML)"
-}
\ No newline at end of file
+ "label": "Databend AI"
+}
diff --git a/docs/cn/guides/54-query/00-sql-analytics.md b/docs/cn/guides/54-query/00-sql-analytics.md
index 64afb2a6fa..8aa1eb616b 100644
--- a/docs/cn/guides/54-query/00-sql-analytics.md
+++ b/docs/cn/guides/54-query/00-sql-analytics.md
@@ -1,277 +1,202 @@
---
-title: SQL 分析(SQL Analytics)
+title: SQL 分析
---
-> **场景(Scenario):** EverDrive Smart Vision 的分析师整理了一组共享的驾驶会话(drive sessions)和关键帧(key frames),使每个下游工作负载都能查询相同的 ID,而无需在系统之间复制数据。
+> **场景:** CityDrive 会把所有行车视频写入共享的关系表,分析师因此可以在同一批 `video_id` / `frame_id` 上做过滤、连接与聚合,供后续的 JSON、向量、地理和 ETL 负载共用。
-本教程将构建一个微型的 **EverDrive Smart Vision** 数据集,并展示 Databend 的单一查询优化器(Query Optimizer)如何在其余指南中发挥作用。您在此处创建的每个 ID(`SES-20240801-SEA01`、`FRAME-0001` …)都会重新出现在 JSON、向量、地理和 ETL 演练中,形成一致的自动驾驶故事。
+本演练为 CityDrive 数据目录中的关系层建模,并依次演示常用的 SQL 构件:过滤、连接、聚合、窗口函数和聚合索引。这里出现的示例 ID 会在其余指南中再次用到。
-## 1. 创建示例表
-两张表分别记录测试会话和从行车记录仪视频中提取的重要帧。
+## 1. 创建基础表
+`citydrive_videos` 保存视频级元数据,而 `frame_events` 记录每段视频里抽出的关键帧。
```sql
-CREATE OR REPLACE TABLE drive_sessions (
- session_id VARCHAR,
- vehicle_id VARCHAR,
- route_name VARCHAR,
- start_time TIMESTAMP,
- end_time TIMESTAMP,
- weather VARCHAR,
- camera_setup VARCHAR
+CREATE OR REPLACE TABLE citydrive_videos (
+ video_id STRING,
+ vehicle_id STRING,
+ capture_date DATE,
+ route_name STRING,
+ weather STRING,
+ camera_source STRING,
+ duration_sec INT
);
CREATE OR REPLACE TABLE frame_events (
- frame_id VARCHAR,
- session_id VARCHAR,
- frame_index INT,
- captured_at TIMESTAMP,
- event_type VARCHAR,
- risk_score DOUBLE
+ frame_id STRING,
+ video_id STRING,
+ frame_index INT,
+ collected_at TIMESTAMP,
+ event_tag STRING,
+ risk_score DOUBLE,
+ speed_kmh DOUBLE
);
-INSERT INTO drive_sessions VALUES
- ('SES-20240801-SEA01', 'VEH-01', 'Seattle → Bellevue → Seattle', '2024-08-01 09:00', '2024-08-01 10:10', 'Sunny', 'Dual 1080p'),
- ('SES-20240802-SEA02', 'VEH-02', 'Downtown Night Loop', '2024-08-02 20:15', '2024-08-02 21:05', 'Light Rain','Night Vision'),
- ('SES-20240803-SEA03', 'VEH-03', 'Harbor Industrial Route', '2024-08-03 14:05', '2024-08-03 15:30', 'Overcast', 'Thermal + RGB');
+INSERT INTO citydrive_videos VALUES
+ ('VID-20250101-001', 'VEH-21', '2025-01-01', 'Downtown Loop', 'Rain', 'roof_cam', 3580),
+ ('VID-20250101-002', 'VEH-05', '2025-01-01', 'Port Perimeter', 'Overcast', 'front_cam',4020),
+ ('VID-20250102-001', 'VEH-21', '2025-01-02', 'Airport Connector', 'Clear', 'front_cam',3655),
+ ('VID-20250103-001', 'VEH-11', '2025-01-03', 'CBD Night Sweep', 'LightFog', 'rear_cam', 3310);
INSERT INTO frame_events VALUES
- ('FRAME-0001', 'SES-20240801-SEA01', 120, '2024-08-01 09:32:15', 'SuddenBrake', 0.82),
- ('FRAME-0002', 'SES-20240801-SEA01', 342, '2024-08-01 09:48:03', 'CrosswalkPedestrian', 0.67),
- ('FRAME-0003', 'SES-20240802-SEA02', 88, '2024-08-02 20:29:41', 'NightLowVisibility', 0.59),
- ('FRAME-0004', 'SES-20240802-SEA02', 214, '2024-08-02 20:48:12', 'EmergencyVehicle', 0.73),
- ('FRAME-0005', 'SES-20240803-SEA03', 305, '2024-08-03 15:02:44', 'CyclistOvertake', 0.64);
+ ('FRAME-0101', 'VID-20250101-001', 125, '2025-01-01 08:15:21', 'hard_brake', 0.81, 32.4),
+ ('FRAME-0102', 'VID-20250101-001', 416, '2025-01-01 08:33:54', 'pedestrian', 0.67, 24.8),
+ ('FRAME-0201', 'VID-20250101-002', 298, '2025-01-01 11:12:02', 'lane_merge', 0.74, 48.1),
+ ('FRAME-0301', 'VID-20250102-001', 188, '2025-01-02 09:44:18', 'hard_brake', 0.59, 52.6),
+ ('FRAME-0401', 'VID-20250103-001', 522, '2025-01-03 21:18:07', 'night_lowlight', 0.63, 38.9);
```
-> 需要回顾表 DDL?请参阅 [CREATE TABLE](/sql/sql-commands/ddl/table/ddl-create-table)。
+文档:[CREATE TABLE](/sql/sql-commands/ddl/table/ddl-create-table)、[INSERT](/sql/sql-commands/dml/dml-insert)。
---
-## 2. 过滤最近会话
-让分析聚焦在最新的驾驶记录上。
+## 2. 只看最新车次
+把分析范围限定在最近 3 天采集的视频,并统计每段视频被标记的帧数。
```sql
-WITH recent_sessions AS (
- SELECT *
- FROM drive_sessions
- WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP)
+WITH recent_videos AS (
+ SELECT *
+ FROM citydrive_videos
+ WHERE capture_date >= DATEADD('day', -3, TODAY())
)
-SELECT *
-FROM recent_sessions
-ORDER BY start_time DESC;
+SELECT v.video_id,
+ v.route_name,
+ v.weather,
+ COUNT(f.frame_id) AS flagged_frames
+FROM recent_videos v
+LEFT JOIN frame_events f USING (video_id)
+GROUP BY v.video_id, v.route_name, v.weather
+ORDER BY flagged_frames DESC;
```
-尽早过滤可加快后续连接(JOIN)与聚合(GROUP BY)。文档:[WHERE & CASE](/sql/sql-commands/query-syntax/query-select#where-clause)。
+文档:[DATEADD](/sql/sql-functions/datetime-functions/date-add)、[GROUP BY](/sql/sql-commands/query-syntax/query-select#group-by-clause)。
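+上面的查询以 `TODAY()` 为基准,而示例数据的日期固定在 2025 年 1 月初,在其他日期运行时结果可能为空。下面是一个最小示意,假设把相对窗口换成固定日期区间来匹配示例数据(日期区间是为演示而做的假设):
+```sql
+-- 假设:用固定日期区间代替相对窗口,以匹配上文插入的示例数据
+SELECT v.video_id,
+       v.route_name,
+       v.weather,
+       COUNT(f.frame_id) AS flagged_frames
+FROM citydrive_videos AS v
+LEFT JOIN frame_events AS f USING (video_id)
+WHERE v.capture_date BETWEEN '2025-01-01'::DATE AND '2025-01-03'::DATE
+GROUP BY v.video_id, v.route_name, v.weather
+ORDER BY flagged_frames DESC;
+```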
---
-## 3. 连接(JOIN)
-### INNER JOIN ... USING
-合并会话元数据与帧级事件。
-
+## 3. 常见 JOIN 模式
+### INNER JOIN:取帧上下文
```sql
-WITH recent_events AS (
- SELECT *
- FROM frame_events
- WHERE captured_at >= DATEADD('day', -7, CURRENT_TIMESTAMP)
-)
-SELECT e.frame_id,
- e.captured_at,
- e.event_type,
- e.risk_score,
- s.vehicle_id,
- s.route_name,
- s.weather
-FROM recent_events e
-JOIN drive_sessions s USING (session_id)
-ORDER BY e.captured_at;
+SELECT f.frame_id,
+ f.event_tag,
+ f.risk_score,
+ v.route_name,
+ v.camera_source
+FROM frame_events AS f
+JOIN citydrive_videos AS v USING (video_id)
+ORDER BY f.collected_at;
```
-### NOT EXISTS(反连接/Anti Join)
-查找缺少会话元数据的事件。
-
+### NOT EXISTS:查找缺少视频元数据的帧
```sql
SELECT frame_id
-FROM frame_events e
+FROM frame_events f
WHERE NOT EXISTS (
- SELECT 1
- FROM drive_sessions s
- WHERE s.session_id = e.session_id
+ SELECT 1
+ FROM citydrive_videos v
+ WHERE v.video_id = f.video_id
);
```
-### LATERAL FLATTEN(JSON 展开/Unnest)
-将事件与 JSON 载荷中的检测对象合并。
-
+### LATERAL FLATTEN:展开 JSON 检测
```sql
-SELECT e.frame_id,
- obj.value['type']::STRING AS object_type
-FROM frame_events e
-JOIN frame_payloads p USING (frame_id),
- LATERAL FLATTEN(p.payload['objects']) AS obj;
+SELECT f.frame_id,
+ obj.value['type']::STRING AS detected_type,
+ obj.value['confidence']::DOUBLE AS confidence
+FROM frame_events AS f
+JOIN frame_payloads AS p ON f.frame_id = p.frame_id,
+ LATERAL FLATTEN(input => p.payload['objects']) AS obj
+WHERE f.event_tag = 'pedestrian'
+ORDER BY confidence DESC;
```
-更多模式:[JOIN 参考](/sql/sql-commands/query-syntax/query-join)。
+文档:[JOIN](/sql/sql-commands/query-syntax/query-join)、[FLATTEN](/sql/sql-functions/table-functions/flatten)。
---
-## 4. 分组(GROUP BY)
-### GROUP BY route_name, event_type
-标准 `GROUP BY` 比较路线与事件类型。
-
+## 4. 车队 KPI 聚合
+### 分路线的行为统计
```sql
-WITH recent_events AS (
- SELECT *
- FROM frame_events
- WHERE captured_at >= DATEADD('week', -4, CURRENT_TIMESTAMP)
-)
-SELECT route_name,
- event_type,
- COUNT(*) AS event_count,
- AVG(risk_score) AS avg_risk
-FROM recent_events
-JOIN drive_sessions USING (session_id)
-GROUP BY route_name, event_type
-ORDER BY avg_risk DESC, event_count DESC;
+SELECT v.route_name,
+ f.event_tag,
+ COUNT(*) AS occurrences,
+ AVG(f.risk_score) AS avg_risk
+FROM frame_events f
+JOIN citydrive_videos v USING (video_id)
+GROUP BY v.route_name, f.event_tag
+ORDER BY avg_risk DESC, occurrences DESC;
```
-### GROUP BY ROLLUP
-增加路线小计及总计。
-
+### ROLLUP 总计
```sql
-SELECT route_name,
- event_type,
- COUNT(*) AS event_count,
- AVG(risk_score) AS avg_risk
-FROM frame_events
-JOIN drive_sessions USING (session_id)
-GROUP BY ROLLUP(route_name, event_type)
-ORDER BY route_name NULLS LAST, event_type;
+SELECT v.route_name,
+ f.event_tag,
+ COUNT(*) AS occurrences
+FROM frame_events f
+JOIN citydrive_videos v USING (video_id)
+GROUP BY ROLLUP(v.route_name, f.event_tag)
+ORDER BY v.route_name NULLS LAST, f.event_tag;
```
-### GROUP BY CUBE
-生成路线与事件类型的所有组合。
-
+### CUBE:路线 × 天气 覆盖
```sql
-SELECT route_name,
- event_type,
- COUNT(*) AS event_count,
- AVG(risk_score) AS avg_risk
-FROM frame_events
-JOIN drive_sessions USING (session_id)
-GROUP BY CUBE(route_name, event_type)
-ORDER BY route_name NULLS LAST, event_type;
+SELECT v.route_name,
+ v.weather,
+ COUNT(DISTINCT v.video_id) AS videos
+FROM citydrive_videos v
+GROUP BY CUBE(v.route_name, v.weather)
+ORDER BY v.route_name NULLS LAST, v.weather NULLS LAST;
```
---
-## 5. 窗口函数(WINDOW FUNCTION)
-### SUM(...) OVER(运行总计/running total)
-用运行 `SUM` 跟踪每次驾驶的累积风险。
-
+## 5. 窗口函数
+### 单次视频的风险累计
```sql
-WITH session_event_scores AS (
- SELECT session_id,
- captured_at,
- risk_score
- FROM frame_events
+WITH ordered_events AS (
+ SELECT video_id, collected_at, risk_score
+ FROM frame_events
)
-SELECT session_id,
- captured_at,
+SELECT video_id,
+ collected_at,
risk_score,
SUM(risk_score) OVER (
- PARTITION BY session_id
- ORDER BY captured_at
+ PARTITION BY video_id
+ ORDER BY collected_at
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_risk
-FROM session_event_scores
-ORDER BY session_id, captured_at;
+FROM ordered_events
+ORDER BY video_id, collected_at;
```
-### AVG(...) OVER(移动平均/moving average)
-显示最近三个事件的风险移动平均:
-
+### 帧级滑动平均
```sql
-WITH session_event_scores AS (
- SELECT session_id,
- captured_at,
- risk_score
- FROM frame_events
-)
-SELECT session_id,
- captured_at,
+SELECT video_id,
+ frame_id,
+ frame_index,
risk_score,
AVG(risk_score) OVER (
- PARTITION BY session_id
- ORDER BY captured_at
+ PARTITION BY video_id
+ ORDER BY frame_index
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
- ) AS moving_avg_risk
-FROM session_event_scores
-ORDER BY session_id, captured_at;
+ ) AS rolling_avg_risk
+FROM frame_events
+ORDER BY video_id, frame_index;
```
-窗口函数(Window Functions)让你以内联方式表达滚动总计或平均值。完整列表:[窗口函数(Window Functions)](/sql/sql-functions/window-functions)。
+窗口函数可以在 SQL 中直接表达滚动求和或滑动平均。完整列表见:[窗口函数](/sql/sql-functions/window-functions)。
---
-## 6. 聚合索引加速(Aggregating Index Acceleration)
-用[聚合索引(Aggregating Index)](/guides/performance/aggregating-index)缓存繁重汇总,让仪表盘保持秒级响应。
+## 6. 聚合索引提速
+使用 [Aggregating Index](/guides/performance/aggregating-index) 缓存高频汇总,让仪表盘查询避开全表扫描。
```sql
-CREATE OR REPLACE AGGREGATING INDEX idx_route_event_summary ON frame_events
+CREATE OR REPLACE AGGREGATING INDEX idx_video_event_summary
AS
-SELECT session_id,
- event_type,
+SELECT video_id,
+ event_tag,
COUNT(*) AS event_count,
AVG(risk_score) AS avg_risk
FROM frame_events
-GROUP BY session_id, event_type;
+GROUP BY video_id, event_tag;
```
-再次运行相同的汇总查询——优化器将自动命中索引:
-
-```sql
-SELECT s.route_name,
- e.event_type,
- COUNT(*) AS event_count,
- AVG(e.risk_score) AS avg_risk
-FROM frame_events e
-JOIN drive_sessions s USING (session_id)
-WHERE s.start_time >= DATEADD('week', -8, CURRENT_TIMESTAMP)
-GROUP BY s.route_name, e.event_type
-ORDER BY avg_risk DESC;
-```
-
-`EXPLAIN` 该语句可看到 `AggregatingIndex` 节点而非全表扫描。Databend 在新帧到达时自动刷新索引,无需额外 ETL 即可实现亚秒级仪表盘体验。
-
----
-
-## 7. 存储过程自动化(Stored Procedure Automation)
-将报告逻辑封装到存储过程(Stored Procedure)中,确保在定时任务中按预期执行。
-
-```sql
-CREATE OR REPLACE PROCEDURE generate_weekly_route_report(days_back INT)
-RETURNS TABLE(route_name VARCHAR, event_count BIGINT, avg_risk DOUBLE)
-LANGUAGE SQL
-AS
-$$
-BEGIN
- RETURN TABLE (
- SELECT s.route_name,
- COUNT(*) AS event_count,
- AVG(e.risk_score) AS avg_risk
- FROM frame_events e
- JOIN drive_sessions s USING (session_id)
- WHERE e.captured_at >= DATEADD('day', -days_back, CURRENT_TIMESTAMP)
- GROUP BY s.route_name
- );
-END;
-$$;
-
-CALL PROCEDURE generate_weekly_route_report(28);
-```
-
-返回的结果集可直接用于笔记本、ETL 任务或自动告警。了解更多:[存储过程脚本(Stored Procedure Scripting)](/sql/stored-procedure-scripting)。
-
----
-
-至此,您已拥有完整闭环:摄取会话数据、过滤、连接、聚合、加速重查询、趋势分析并发布。只需替换过滤条件或连接方式,即可将同一套方案应用于驾驶员评分、传感器退化或算法对比等其他智能驾驶 KPI。
\ No newline at end of file
+当你再次运行与索引定义一致的汇总(按 `video_id`、`event_tag` 分组)时,`EXPLAIN` 会显示 `AggregatingIndex` 节点,说明查询已经命中上面的摘要副本。索引会在新的帧写入后自动刷新,无须额外 ETL 即可保持秒级体验。
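+下面的写法仅作示意:假设沿用上文创建的 `idx_video_event_summary`,用 `EXPLAIN` 检查一条与索引定义一致的汇总是否命中索引。具体的计划输出取决于 Databend 版本,这里不做断言。
+```sql
+-- 与索引定义使用相同的分组键和聚合函数
+EXPLAIN
+SELECT video_id,
+       event_tag,
+       COUNT(*) AS event_count,
+       AVG(risk_score) AS avg_risk
+FROM frame_events
+GROUP BY video_id, event_tag;
+```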
diff --git a/docs/cn/guides/54-query/01-json-search.md b/docs/cn/guides/54-query/01-json-search.md
index 11d1202079..c33105526e 100644
--- a/docs/cn/guides/54-query/01-json-search.md
+++ b/docs/cn/guides/54-query/01-json-search.md
@@ -1,140 +1,77 @@
---
-title: JSON 与搜索(Search)
+title: JSON 与搜索
---
-> **场景(Scenario):** EverDrive Smart Vision 的感知服务会为每个观察到的帧发出 JSON 有效载荷(payloads),安全分析师需要在不将数据移出 Databend 的情况下搜索检测结果。
+> **场景:** CityDrive 会为每个抽取出来的帧附带一份 JSON 元数据,并希望直接在 Databend 内用 Elasticsearch 风格的过滤语法完成检索,而不用把数据复制到别的系统。
-EverDrive 的感知 Pipeline(流水线)会发出 JSON 有效载荷,我们可以使用 Elasticsearch 风格的语法进行查询。通过将有效载荷存储为 VARIANT 类型并在创建表时声明倒排索引(inverted index),Databend 允许您直接在数据上运行 Lucene 的 `QUERY` 过滤器。
+Databend 可以在同一仓库里托管多模态信号:VARIANT 列支持倒排索引,位图表刻画标签覆盖率,向量索引用于相似度查询,原生 GEOMETRY 列提供空间过滤。
-## 1. 创建示例表
-每个帧都携带着来自感知模型(边界框、速度、分类)的结构化元数据。
+## 1. 创建元数据表
+每个帧保存一份 JSON 文档;结构保持一致,后续的查询才能复用相同的字段路径。
```sql
-CREATE OR REPLACE TABLE frame_payloads (
- frame_id VARCHAR,
- run_stage VARCHAR,
- payload VARIANT,
- logged_at TIMESTAMP,
- INVERTED INDEX idx_frame_payloads(payload) -- 声明倒排索引(inverted index)
-);
-
-INSERT INTO frame_payloads VALUES
- ('FRAME-0001', 'detection', PARSE_JSON('{
- "objects": [
- {"type":"vehicle","bbox":[545,220,630,380],"confidence":0.94},
- {"type":"pedestrian","bbox":[710,200,765,350],"confidence":0.88}
- ],
- "ego": {"speed_kmh": 32.5, "accel": -2.1}
- }'), '2024-08-01 09:32:16'),
- ('FRAME-0002', 'detection', PARSE_JSON('{
- "objects": [
- {"type":"pedestrian","bbox":[620,210,670,360],"confidence":0.91}
- ],
- "scene": {"lighting":"daytime","weather":"sunny"}
- }'), '2024-08-01 09:48:04'),
- ('FRAME-0003', 'tracking', PARSE_JSON('{
- "objects": [
- {"type":"vehicle","speed_kmh": 18.0,"distance_m": 6.2},
- {"type":"emergency_vehicle","sirens":true}
- ],
- "scene": {"lighting":"night","visibility":"low"}
- }'), '2024-08-02 20:29:42');
-```
-
-## 2. 提取 JSON 路径
-查看有效载荷以确认结构。
-
-```sql
-SELECT frame_id,
- payload['objects'][0]['type']::STRING AS first_object,
- payload['ego']['speed_kmh']::DOUBLE AS ego_speed,
- payload['scene']['lighting']::STRING AS lighting
-FROM frame_payloads
-ORDER BY logged_at;
+CREATE DATABASE IF NOT EXISTS video_unified_demo;
+USE video_unified_demo;
+
+CREATE OR REPLACE TABLE frame_metadata_catalog (
+ doc_id STRING,
+ meta_json VARIANT,
+ captured_at TIMESTAMP,
+ INVERTED INDEX idx_meta_json (meta_json)
+) CLUSTER BY (captured_at);
```
-使用 `::STRING` / `::DOUBLE` 进行类型转换(Casting)可以将 JSON 值暴露给常规的 SQL 过滤器。Databend 还通过 `QUERY` 函数支持在此数据之上进行 Elasticsearch 风格的搜索——通过在变体字段前加上列名(例如 `payload.objects.type`)来引用它们。更多提示:[加载半结构化数据](/guides/load-data/load-semistructured/load-ndjson)。
-
----
-
-## 3. Elasticsearch 风格的搜索(Search)
-`QUERY` 使用 Elasticsearch/Lucene 语法,因此您可以组合布尔逻辑、范围、权重(boosts)和列表。以下是 EverDrive 有效载荷上的几种模式:
-
-### 数组匹配(Array Match)
-查找检测到行人的帧:
+> 需要同时管理多模态数据(向量嵌入、GPS 轨迹、标签位图)?可以直接复用 [向量](./02-vector-db.md) 与 [地理](./03-geo-analytics.md) 指南里的建表语句,再同 JSON 结果拼接。
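+建表语句本身不带数据。下面给出两行示意数据,方便直接试跑后面的 `QUERY()` 示例。字段取值是按后文查询引用的路径(`detections.objects.type`、`scene.weather_code`、`camera.sensor_view`、`media_meta.tagging.labels`、`vehicle.speed_kmh`)虚构的,并非真实的 CityDrive 数据。
+```sql
+-- 假设数据:doc_id 复用 SQL 分析指南中的 frame_id
+INSERT INTO frame_metadata_catalog VALUES
+  ('FRAME-0101', PARSE_JSON('{
+     "detections": {"objects": [{"type": "vehicle", "confidence": 0.93}]},
+     "scene": {"weather_code": "rain"},
+     "camera": {"sensor_view": "roof"},
+     "media_meta": {"tagging": {"labels": ["hard_brake"]}},
+     "vehicle": {"speed_kmh": 32.4}
+   }'), '2025-01-01 08:15:21'),
+  ('FRAME-0102', PARSE_JSON('{
+     "detections": {"objects": [{"type": "pedestrian", "confidence": 0.88}]},
+     "scene": {"weather_code": "rain"},
+     "camera": {"sensor_view": "roof"},
+     "media_meta": {"tagging": {"labels": ["pedestrian"]}},
+     "vehicle": {"speed_kmh": 24.8}
+   }'), '2025-01-01 08:33:54');
+```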
+## 2. 使用 `QUERY()` 的检索模式
+### 数组匹配
```sql
-SELECT frame_id
-FROM frame_payloads
-WHERE QUERY('payload.objects.type:pedestrian')
-ORDER BY logged_at DESC
-LIMIT 10;
+SELECT doc_id,
+ captured_at,
+ meta_json['detections'] AS detections
+FROM frame_metadata_catalog
+WHERE QUERY('meta_json.detections.objects.type:pedestrian')
+ORDER BY captured_at DESC
+LIMIT 5;
```
### 布尔 AND
-车辆行驶速度大于 30 km/h **且** 检测到行人:
-
```sql
-SELECT frame_id,
- payload['ego']['speed_kmh']::DOUBLE AS ego_speed
-FROM frame_payloads
-WHERE QUERY('payload.objects.type:pedestrian AND payload.ego.speed_kmh:[30 TO *]')
-ORDER BY ego_speed DESC;
+SELECT doc_id, captured_at
+FROM frame_metadata_catalog
+WHERE QUERY('meta_json.scene.weather_code:rain
+ AND meta_json.camera.sensor_view:roof')
+ORDER BY captured_at;
```
### 布尔 OR / 列表
-夜间驾驶遇到紧急车辆或骑自行车的人:
-
```sql
-SELECT frame_id
-FROM frame_payloads
-WHERE QUERY('payload.scene.lighting:night AND payload.objects.type:(emergency_vehicle OR cyclist)');
+SELECT doc_id,
+ meta_json['media_meta']['tagging']['labels'] AS labels
+FROM frame_metadata_catalog
+WHERE QUERY('meta_json.media_meta.tagging.labels:(hard_brake OR swerve OR lane_merge)')
+ORDER BY captured_at DESC
+LIMIT 10;
```
### 数值范围
-速度在 10–25 km/h 之间(包含)或严格在 25–40 km/h 之间:
-
```sql
-SELECT frame_id,
- payload['ego']['speed_kmh'] AS speed
-FROM frame_payloads
-WHERE QUERY('payload.ego.speed_kmh:[10 TO 25] OR payload.ego.speed_kmh:{25 TO 40}')
-ORDER BY speed;
+SELECT doc_id,
+ meta_json['vehicle']['speed_kmh']::DOUBLE AS speed
+FROM frame_metadata_catalog
+WHERE QUERY('meta_json.vehicle.speed_kmh:{30 TO 80}')
+ORDER BY speed DESC
+LIMIT 10;
```
### 权重(Boosting)
-优先考虑同时出现行人和车辆的帧,但强调行人项:
-
```sql
-SELECT frame_id,
+SELECT doc_id,
SCORE() AS relevance
-FROM frame_payloads
-WHERE QUERY('payload.objects.type:pedestrian^2 AND payload.objects.type:vehicle')
+FROM frame_metadata_catalog
+WHERE QUERY('meta_json.scene.weather_code:rain AND (meta_json.media_meta.tagging.labels:hard_brake^2 OR meta_json.media_meta.tagging.labels:swerve)')
ORDER BY relevance DESC
-LIMIT 10;
-```
-
-请参阅 [搜索函数](/sql/sql-functions/search-functions) 以了解 `QUERY`、`SCORE()` 和相关辅助函数支持的完整 Elasticsearch 语法。
-
----
-
-## 4. 交叉引用帧事件
-将查询结果连接回在分析指南中创建的帧级风险评分。
-
-```sql
-WITH risky_frames AS (
- SELECT frame_id,
- payload['ego']['speed_kmh']::DOUBLE AS ego_speed
- FROM frame_payloads
- WHERE QUERY('payload.objects.type:pedestrian AND payload.ego.speed_kmh:[30 TO *]')
-)
-SELECT r.frame_id,
- e.event_type,
- e.risk_score,
- r.ego_speed
-FROM risky_frames r
-JOIN frame_events e USING (frame_id)
-ORDER BY e.risk_score DESC;
+LIMIT 8;
```
-由于 `frame_id` 在表之间共享,您可以立即从原始有效载荷跳转到精选分析结果。
\ No newline at end of file
+`QUERY()` 遵循 Elasticsearch 的语义(布尔逻辑、范围、权重、列表等),`SCORE()` 则暴露检索相关性,方便在 SQL 里直接排序。完整算子列表见:[搜索函数](/sql/sql-functions/search-functions)。
diff --git a/docs/cn/guides/54-query/02-vector-db.md b/docs/cn/guides/54-query/02-vector-db.md
index 34e38f4395..5f3e691fe9 100644
--- a/docs/cn/guides/54-query/02-vector-db.md
+++ b/docs/cn/guides/54-query/02-vector-db.md
@@ -1,95 +1,99 @@
---
-title: 向量搜索(Vector Search)
+title: 向量搜索
---
-> **场景:** EverDrive Smart Vision 将紧凑的视觉嵌入(vision embeddings)附加到高风险帧,以便调查团队直接在 Databend 内检索相似场景。
+> **场景:** CityDrive 把每个帧的嵌入直接存放在 Databend,语义相似搜索(“找出和它看起来像的帧”)便可与传统 SQL 分析一同运行,无需再部署独立的向量服务。
-每帧都附带视觉嵌入,感知工程师可借此发现相似情况。本指南演示如何插入这些向量,并在同一 EverDrive ID 上执行语义搜索。
+`frame_embeddings` 表与 `frame_events`、`frame_payloads`、`frame_geo_points` 共用同一批 `frame_id`,让语义检索与常规 SQL 牢牢绑定在一起。
-## 1. 创建示例表
-为便于阅读,示例使用四维向量。生产环境中可保存 CLIP 或自监督模型输出的 512 维或 1536 维嵌入。
+## 1. 准备嵌入表
+生产模型通常输出 512–1536 维,本例使用 512 维方便直接复制到演示集群。
```sql
CREATE OR REPLACE TABLE frame_embeddings (
- frame_id VARCHAR,
- session_id VARCHAR,
- embedding VECTOR(4),
- model_version VARCHAR,
- created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
- VECTOR INDEX idx_frame_embeddings(embedding) distance='cosine'
+ frame_id STRING,
+ video_id STRING,
+ sensor_view STRING,
+ embedding VECTOR(512),
+ encoder_build STRING,
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+ VECTOR INDEX idx_frame_embeddings(embedding) distance='cosine'
);
INSERT INTO frame_embeddings VALUES
- ('FRAME-0001', 'SES-20240801-SEA01', [0.18, 0.42, 0.07, 0.12]::VECTOR(4), 'clip-mini-v1', DEFAULT),
- ('FRAME-0002', 'SES-20240801-SEA01', [0.20, 0.38, 0.12, 0.18]::VECTOR(4), 'clip-mini-v1', DEFAULT),
- ('FRAME-0003', 'SES-20240802-SEA02', [0.62, 0.55, 0.58, 0.61]::VECTOR(4), 'night-fusion-v2', DEFAULT),
- ('FRAME-0004', 'SES-20240802-SEA02', [0.57, 0.49, 0.52, 0.55]::VECTOR(4), 'night-fusion-v2', DEFAULT);
+ ('FRAME-0101', 'VID-20250101-001', 'roof_cam', RANDOM_VECTOR(512), 'clip-lite-v1', DEFAULT),
+ ('FRAME-0102', 'VID-20250101-001', 'roof_cam', RANDOM_VECTOR(512), 'clip-lite-v1', DEFAULT),
+ ('FRAME-0201', 'VID-20250101-002', 'front_cam',RANDOM_VECTOR(512), 'night-fusion-v2', DEFAULT),
+ ('FRAME-0401', 'VID-20250103-001', 'rear_cam', RANDOM_VECTOR(512), 'night-fusion-v2', DEFAULT);
```
-文档:[向量数据类型(Vector data type)](/sql/sql-reference/data-types/vector) 与 [向量索引(Vector index)](/sql/sql-reference/data-types/vector#vector-indexing)。
+文档:[向量类型](/sql/sql-reference/data-types/vector)、[向量索引](/sql/sql-reference/data-types/vector#vector-indexing)。
---
-## 2. COSINE_DISTANCE 搜索
-查找与 `FRAME-0001` 最相似的帧。
+## 2. 运行余弦搜索
+先取出某一帧的嵌入,再让 HNSW 索引返回最近邻。
```sql
WITH query_embedding AS (
- SELECT embedding
- FROM frame_embeddings
- WHERE frame_id = 'FRAME-0001'
- LIMIT 1
+ SELECT embedding
+ FROM frame_embeddings
+ WHERE frame_id = 'FRAME-0101'
)
SELECT e.frame_id,
- e.session_id,
- cosine_distance(e.embedding, q.embedding) AS distance
-FROM frame_embeddings e
-CROSS JOIN query_embedding q
+ e.video_id,
+ COSINE_DISTANCE(e.embedding, q.embedding) AS distance
+FROM frame_embeddings AS e
+CROSS JOIN query_embedding AS q
ORDER BY distance
LIMIT 3;
```
-余弦距离计算将利用先前创建的 HNSW 索引,优先返回最近邻帧。
+距离越小越相似。`VECTOR INDEX` 走的是近似最近邻检索,即便有数百万帧也不必逐行计算距离。
----
-
-## 3. WHERE 过滤 + 相似度
-结合相似度搜索与传统谓词,缩小结果范围。
+继续叠加传统谓词(如路线、视频、传感器视角),即可在向量比对前后收窄候选集。
```sql
WITH query_embedding AS (
- SELECT embedding
- FROM frame_embeddings
- WHERE frame_id = 'FRAME-0003'
- LIMIT 1
+ SELECT embedding
+ FROM frame_embeddings
+ WHERE frame_id = 'FRAME-0201'
)
SELECT e.frame_id,
- cosine_distance(e.embedding, q.embedding) AS distance
-FROM frame_embeddings e
-CROSS JOIN query_embedding q
-WHERE e.session_id = 'SES-20240802-SEA02'
-ORDER BY distance;
+ e.sensor_view,
+ COSINE_DISTANCE(e.embedding, q.embedding) AS distance
+FROM frame_embeddings AS e
+CROSS JOIN query_embedding AS q
+WHERE e.sensor_view = 'rear_cam'
+ORDER BY distance
+LIMIT 5;
```
+优化器会在满足 `sensor_view` 过滤的同时继续走向量索引。
+
---
-## 4. JOIN 语义 + 风险元数据
-将语义结果与风险评分或检测载荷关联,丰富调查维度。
+## 3. 丰富相似帧
+把 Top-N 相似帧物化,再与 `frame_events` 连接,方便下游分析。
```sql
WITH query_embedding AS (
- SELECT embedding FROM frame_embeddings WHERE frame_id = 'FRAME-0001' LIMIT 1
+ SELECT embedding
+ FROM frame_embeddings
+ WHERE frame_id = 'FRAME-0102'
),
similar_frames AS (
- SELECT frame_id,
- cosine_distance(e.embedding, q.embedding) AS distance
+ SELECT frame_id,
+ video_id,
+ COSINE_DISTANCE(e.embedding, q.embedding) AS distance
FROM frame_embeddings e
CROSS JOIN query_embedding q
ORDER BY distance
LIMIT 5
)
SELECT sf.frame_id,
- fe.event_type,
+ sf.video_id,
+ fe.event_tag,
fe.risk_score,
sf.distance
FROM similar_frames sf
@@ -97,4 +101,4 @@ LEFT JOIN frame_events fe USING (frame_id)
ORDER BY sf.distance;
```
-该混合视图呈现“外观类似 FRAME-0001 且触发高风险事件的帧”。
\ No newline at end of file
+嵌入与关系表同库共存,调查人员可以立即从“视觉相似”跳转到“同时伴随 `hard_brake` 标签、特定天气或 JSON 检测”的线索,无需导出数据。
diff --git a/docs/cn/guides/54-query/03-geo-analytics.md b/docs/cn/guides/54-query/03-geo-analytics.md
index 239caded11..0ab247ff1d 100644
--- a/docs/cn/guides/54-query/03-geo-analytics.md
+++ b/docs/cn/guides/54-query/03-geo-analytics.md
@@ -1,93 +1,98 @@
---
-title: 地理空间分析(Geo Analytics)
+title: 地理分析
---
-> **场景(Scenario):** EverDrive Smart Vision 会记录每个关键帧的 GPS 坐标,以便运营团队在城市中绘制危险驾驶热点图。
+> **场景:** CityDrive 会为每个被标记的帧记录精准的 GPS 定位以及与信号灯的距离,运营人员可以纯 SQL 回答“事故发生在什么位置?”之类的问题。
-每帧都带有 GPS 坐标,因此我们可以把危险情况映射到整个城市。本指南新增一张地理空间表,并使用相同的 EverDrive 会话 ID 演示空间过滤、多边形和 H3 分桶。
+`frame_geo_points` 与 `signal_contact_points` 同样复用本指南里的 `video_id` / `frame_id`,因此可以在不复制数据的情况下把 SQL 指标延伸到地图视图。
-## 1. 创建示例表
-每条记录表示捕获关键帧时自车(ego vehicle)的位置。将坐标存储为 `GEOMETRY` 类型,即可复用本工作负载中的 `ST_X`、`ST_Y` 和 `HAVERSINE` 等函数。
+## 1. 创建位置表
+下方片段给出 `frame_geo_points` 与 `signal_contact_points` 的表结构,并为前者写入几条深圳示例数据;如果表已存在,`CREATE OR REPLACE` 会将其重建。
```sql
-CREATE OR REPLACE TABLE drive_geo (
- frame_id VARCHAR,
- session_id VARCHAR,
- location GEOMETRY,
- speed_kmh DOUBLE,
- heading_deg DOUBLE
+CREATE OR REPLACE TABLE frame_geo_points (
+ video_id STRING,
+ frame_id STRING,
+ position_wgs84 GEOMETRY,
+ solution_grade INT,
+ source_system STRING,
+ created_at TIMESTAMP
);
-INSERT INTO drive_geo VALUES
- ('FRAME-0001', 'SES-20240801-SEA01', TO_GEOMETRY('SRID=4326;POINT(-122.3321 47.6062)'), 28.0, 90),
- ('FRAME-0002', 'SES-20240801-SEA01', TO_GEOMETRY('SRID=4326;POINT(-122.3131 47.6105)'), 35.4, 120),
- ('FRAME-0003', 'SES-20240802-SEA02', TO_GEOMETRY('SRID=4326;POINT(-122.3419 47.6205)'), 18.5, 45),
- ('FRAME-0004', 'SES-20240802-SEA02', TO_GEOMETRY('SRID=4326;POINT(-122.3490 47.6138)'), 22.3, 60),
- ('FRAME-0005', 'SES-20240803-SEA03', TO_GEOMETRY('SRID=4326;POINT(-122.3610 47.6010)'), 30.1, 210);
+INSERT INTO frame_geo_points VALUES
+ ('VID-20250101-001','FRAME-0101',TO_GEOMETRY('SRID=4326;POINT(114.0579 22.5431)'),104,'fusion_gnss','2025-01-01 08:15:21'),
+ ('VID-20250101-001','FRAME-0102',TO_GEOMETRY('SRID=4326;POINT(114.0610 22.5460)'),104,'fusion_gnss','2025-01-01 08:33:54'),
+ ('VID-20250101-002','FRAME-0201',TO_GEOMETRY('SRID=4326;POINT(114.1040 22.5594)'),104,'fusion_gnss','2025-01-01 11:12:02'),
+ ('VID-20250102-001','FRAME-0301',TO_GEOMETRY('SRID=4326;POINT(114.0822 22.5368)'),104,'fusion_gnss','2025-01-02 09:44:18'),
+ ('VID-20250103-001','FRAME-0401',TO_GEOMETRY('SRID=4326;POINT(114.1195 22.5443)'),104,'fusion_gnss','2025-01-03 21:18:07');
+
+CREATE OR REPLACE TABLE signal_contact_points (
+ node_id STRING,
+ signal_position GEOMETRY,
+ video_id STRING,
+ frame_id STRING,
+ frame_position GEOMETRY,
+ distance_m DOUBLE,
+ created_at TIMESTAMP
+);
```
文档:[地理空间数据类型](/sql/sql-reference/data-types/geospatial)。
---
-## 2. ST_DISTANCE 半径过滤
-`ST_DISTANCE` 函数用于测量几何体之间的距离。将帧位置和热点均转换到 Web Mercator(SRID 3857),结果以米为单位,再过滤 500 米以内。
+## 2. 空间过滤
+可以计算帧与某个参考点(示例中的 `hq` 坐标)的距离,或检查它是否落在多边形内部。需要以米为单位时,把坐标投影到 SRID 3857;Web Mercator 会随纬度拉伸距离,结果应视为近似值。
```sql
-SELECT g.frame_id,
- g.session_id,
- e.event_type,
- e.risk_score,
+SELECT l.frame_id,
+ l.video_id,
+ f.event_tag,
ST_DISTANCE(
- ST_TRANSFORM(g.location, 3857),
- ST_TRANSFORM(TO_GEOMETRY('SRID=4326;POINT(-122.3350 47.6080)'), 3857)
- ) AS meters_from_hotspot
-FROM drive_geo g
-JOIN frame_events e USING (frame_id)
+ ST_TRANSFORM(l.position_wgs84, 3857),
+ ST_TRANSFORM(TO_GEOMETRY('SRID=4326;POINT(114.0600 22.5450)'), 3857)
+ ) AS meters_from_hq
+FROM frame_geo_points AS l
+JOIN frame_events AS f USING (frame_id)
WHERE ST_DISTANCE(
- ST_TRANSFORM(g.location, 3857),
- ST_TRANSFORM(TO_GEOMETRY('SRID=4326;POINT(-122.3350 47.6080)'), 3857)
- ) <= 500
-ORDER BY meters_from_hotspot;
+ ST_TRANSFORM(l.position_wgs84, 3857),
+ ST_TRANSFORM(TO_GEOMETRY('SRID=4326;POINT(114.0600 22.5450)'), 3857)
+ ) <= 400
+ORDER BY meters_from_hq;
```
-需要原始几何调试?在投影中加入 `ST_ASTEXT(g.location)`。偏好直接的大圆计算?改用 `HAVERSINE` 函数,它直接操作 `ST_X`/`ST_Y` 坐标。
-
----
-
-## 3. ST_CONTAINS 多边形过滤
-检查事件是否发生在划定安全区内(如学校区域)。
+调试时可以输出 `ST_ASTEXT(l.position_wgs84)`,若偏好直接使用球面距离,可改用 [`HAVERSINE`](/sql/sql-functions/geospatial-functions#trigonometric-distance-functions)。
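+下面是一个最小示意,假设 `HAVERSINE` 的参数顺序为(纬度 1, 经度 1, 纬度 2, 经度 2)且返回值以千米为单位,使用前请对照函数文档确认。对 SRID 4326 的点,`ST_X` 返回经度,`ST_Y` 返回纬度。
+```sql
+-- 假设:HAVERSINE(lat1, lon1, lat2, lon2) 返回千米,乘以 1000 换算成米
+SELECT l.frame_id,
+       HAVERSINE(ST_Y(l.position_wgs84), ST_X(l.position_wgs84), 22.5450, 114.0600) * 1000 AS meters_from_hq
+FROM frame_geo_points AS l
+ORDER BY meters_from_hq;
+```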
```sql
WITH school_zone AS (
- SELECT TO_GEOMETRY('SRID=4326;POLYGON((
- -122.3415 47.6150,
- -122.3300 47.6150,
- -122.3300 47.6070,
- -122.3415 47.6070,
- -122.3415 47.6150
- ))') AS poly
+ SELECT TO_GEOMETRY('SRID=4326;POLYGON((
+ 114.0505 22.5500,
+ 114.0630 22.5500,
+ 114.0630 22.5420,
+ 114.0505 22.5420,
+ 114.0505 22.5500
+ ))') AS poly
)
-SELECT g.frame_id,
- g.session_id,
- e.event_type
-FROM drive_geo g
-JOIN frame_events e USING (frame_id)
+SELECT l.frame_id,
+ l.video_id,
+ f.event_tag
+FROM frame_geo_points AS l
+JOIN frame_events AS f USING (frame_id)
CROSS JOIN school_zone
-WHERE ST_CONTAINS(poly, g.location);
+WHERE ST_CONTAINS(poly, l.position_wgs84);
```
---
-## 4. GEO_TO_H3 热力图
-按六边形单元聚合事件,构建路线热力图。
+## 3. 六边形聚合
+把风险帧聚合进 H3 单元,用于仪表盘或热力图。
```sql
-SELECT GEO_TO_H3(ST_X(location), ST_Y(location), 8) AS h3_cell,
+SELECT GEO_TO_H3(ST_X(position_wgs84), ST_Y(position_wgs84), 8) AS h3_cell,
COUNT(*) AS frame_count,
- AVG(e.risk_score) AS avg_risk
-FROM drive_geo
-JOIN frame_events e USING (frame_id)
+ AVG(f.risk_score) AS avg_risk
+FROM frame_geo_points AS l
+JOIN frame_events AS f USING (frame_id)
GROUP BY h3_cell
ORDER BY avg_risk DESC;
```
@@ -96,44 +101,56 @@ ORDER BY avg_risk DESC;
---
-## 5. ST_DISTANCE + JSON 查询
-将空间距离检查与丰富的检测元数据(来自 JSON 指南)结合,生成精准告警。
+## 4. 交通信号上下文
+连接 `signal_contact_points` 与 `frame_geo_points`,即可核对已存储的距离指标,或把空间条件与 JSON 搜索联动。
```sql
-WITH near_intersection AS (
- SELECT frame_id
- FROM drive_geo
- WHERE ST_DISTANCE(
- ST_TRANSFORM(location, 3857),
- ST_TRANSFORM(TO_GEOMETRY('SRID=4326;POINT(-122.3410 47.6130)'), 3857)
- ) <= 200
+SELECT t.node_id,
+ t.video_id,
+ t.frame_id,
+ ST_DISTANCE(t.signal_position, t.frame_position) AS recomputed_distance,
+ t.distance_m AS stored_distance,
+ l.source_system
+FROM signal_contact_points AS t
+JOIN frame_geo_points AS l USING (frame_id)
+WHERE t.distance_m < 0.03 -- 阈值须与 distance_m 的存储单位一致;此处假定 0.03 约合 30 米
+ORDER BY t.distance_m;
+```
+
+```sql
+WITH near_junction AS (
+ SELECT frame_id
+ FROM frame_geo_points
+ WHERE ST_DISTANCE(
+ ST_TRANSFORM(position_wgs84, 3857),
+ ST_TRANSFORM(TO_GEOMETRY('SRID=4326;POINT(114.0700 22.5400)'), 3857)
+ ) <= 150
)
-SELECT n.frame_id,
- p.payload['objects'][0]['type']::STRING AS first_object,
- e.event_type,
- e.risk_score
-FROM near_intersection n
-JOIN frame_payloads p USING (frame_id)
-JOIN frame_events e USING (frame_id)
-WHERE QUERY('payload.objects.type:pedestrian');
+SELECT f.frame_id,
+ f.event_tag,
+ meta.meta_json['media_meta']['tagging']['labels'] AS labels
+FROM near_junction nj
+JOIN frame_events AS f USING (frame_id)
+JOIN frame_metadata_catalog AS meta
+ ON meta.doc_id = nj.frame_id
+WHERE QUERY('meta_json.media_meta.tagging.labels:hard_brake');
```
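-Spatial filters, JSON operators, and classic SQL all run in a single statement.
+This pattern filters by geographic range first, then runs the JSON search on the remaining frames.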
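The 150 threshold in the query above is compared against distances computed in EPSG:3857, whose units stretch with latitude. The sketch below assumes the spherical Web Mercator model (scale factor 1/cos(latitude)) and is only reasonable for short distances. It estimates what such a threshold means on the ground near the junction at latitude 22.54. The helper names are hypothetical and are not part of Databend.

```python
import math


def mercator_to_ground_m(mercator_dist_m: float, lat_deg: float) -> float:
    """Approximate ground length of a short EPSG:3857 distance near lat_deg.

    Spherical Web Mercator stretches lengths by 1 / cos(latitude), so a
    projected distance shrinks by cos(latitude) on the ground.
    """
    return mercator_dist_m * math.cos(math.radians(lat_deg))


def ground_to_mercator_m(ground_dist_m: float, lat_deg: float) -> float:
    """Inverse helper: projected threshold needed for a desired ground distance."""
    return ground_dist_m / math.cos(math.radians(lat_deg))


junction_lat = 22.54  # latitude of POINT(114.0700 22.5400) in the query above

# Ground distance that a threshold of 150 projected units corresponds to.
print(round(mercator_to_ground_m(150, junction_lat), 1))

# Projected threshold to use if 150 m on the ground is the real requirement.
print(round(ground_to_mercator_m(150, junction_lat), 1))
```

Under this approximation, a threshold of 150 projected units covers about 138.5 m on the ground at this latitude. If the requirement is a true 150 m radius, the threshold would have to be raised accordingly.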
---
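-## 6. Create a Heatmap View
-Export hexagon-level summaries to visualization tools or map layers.
+## 5. Publish a Heatmap View
+Wrap the spatial summary in a view so BI or GIS tools can query it directly.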
```sql
-CREATE OR REPLACE VIEW v_route_heatmap AS (
- SELECT GEO_TO_H3(ST_X(location), ST_Y(location), 7) AS h3_cell,
- COUNT(*) AS frames,
- AVG(e.risk_score) AS avg_risk
- FROM drive_geo
- JOIN frame_events e USING (frame_id)
- GROUP BY h3_cell
-);
+CREATE OR REPLACE VIEW v_citydrive_geo_heatmap AS
+SELECT GEO_TO_H3(ST_X(position_wgs84), ST_Y(position_wgs84), 7) AS h3_cell,
+ COUNT(*) AS frames,
+ AVG(f.risk_score) AS avg_risk
+FROM frame_geo_points AS l
+JOIN frame_events AS f USING (frame_id)
+GROUP BY h3_cell;
```
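-Downstream systems can query `v_route_heatmap` directly to render risk hotspots on a map without reprocessing raw telemetry.
\ No newline at end of file
+The same set of `video_id` values now backs vector, text, and spatial queries, so the investigation team no longer needs to maintain extra pipelines.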
diff --git a/docs/cn/guides/54-query/04-lakehouse-etl.md b/docs/cn/guides/54-query/04-lakehouse-etl.md
index 3012fe00bd..6de3620406 100644
--- a/docs/cn/guides/54-query/04-lakehouse-etl.md
+++ b/docs/cn/guides/54-query/04-lakehouse-etl.md
@@ -1,186 +1,224 @@
---
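-title: Integrated Lakehouse ETL (Lakehouse ETL)
+title: Lakehouse ETL
---
-> **Scenario:** EverDrive Smart Vision's data engineering team exports each road-test batch as Parquet files so that the unified workloads can load, query, and enrich the same telemetry inside Databend.
+> **Scenario:** CityDrive's data engineering team exports each batch of driving recordings as Parquet (videos, frame events, JSON metadata, embeddings, GPS tracks, traffic-light distances) and wants a single COPY workflow to refresh the shared tables in Databend.
-EverDrive's ingestion loop is simple:
+The loading loop is straightforward:
```
-Object storage export (e.g. Parquet) → Stage → COPY INTO → (optional) Stream & Task
+Object storage → STAGE → COPY INTO table → (optional) STREAMS / TASKS
```
-Adjust the bucket path and credentials (swap Parquet for your actual format if it differs), then paste the commands below. All syntax matches the official [Load Data guide](/guides/load-data/).
+Adjust the bucket path or format to match your environment, then run the SQL below as-is. The syntax matches the [Load Data guide](/guides/load-data/).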
---
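-## 1. Stage
-EverDrive's data engineering team exports four files per batch (sessions, frame events, detection payloads with nested JSON fields, and frame embeddings) to an S3 bucket. This guide uses Parquet as the example; change only `FILE_FORMAT` to ingest CSV, JSON, or another supported format. Create a named connection once and reuse it for every Stage.
+## 1. Create a Stage
+Create a reusable Stage for the bucket that holds the CityDrive exports. The example uses Parquet; you can switch to any supported format.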
```sql
-CREATE OR REPLACE CONNECTION everdrive_s3
+CREATE OR REPLACE CONNECTION citydrive_s3
STORAGE_TYPE = 's3'
ACCESS_KEY_ID = ''
SECRET_ACCESS_KEY = '';
-CREATE OR REPLACE STAGE drive_stage
- URL = 's3://everdrive-lakehouse/raw/'
- CONNECTION = (CONNECTION_NAME = 'everdrive_s3')
+CREATE OR REPLACE STAGE citydrive_stage
+ URL = 's3://citydrive-lakehouse/raw/'
+ CONNECTION = (CONNECTION_NAME = 'citydrive_s3')
FILE_FORMAT = (TYPE = 'PARQUET');
```
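-See [Create Stage](/sql/sql-commands/ddl/stage/ddl-create-stage) for more options.
+> [!IMPORTANT]
+> Replace the AWS keys and bucket URL in the example with real values; otherwise `LIST`, `SELECT ... FROM @citydrive_stage`, and `COPY INTO` all fail with 403/`InvalidAccessKeyId`.
-List the export folders (Parquet in this example) to confirm they are visible:
+Quick check: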
```sql
-LIST @drive_stage/sessions/;
-LIST @drive_stage/frame-events/;
-LIST @drive_stage/payloads/;
-LIST @drive_stage/embeddings/;
+LIST @citydrive_stage/videos/;
+LIST @citydrive_stage/frame-events/;
+LIST @citydrive_stage/manifests/;
+LIST @citydrive_stage/frame-embeddings/;
+LIST @citydrive_stage/frame-locations/;
+LIST @citydrive_stage/traffic-lights/;
```
---
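-## 2. Preview
-Inspect the Parquet files before loading to verify the schema and sample some rows.
+## 2. Preview Files
+Run a `SELECT` against the Stage before loading to confirm the schema and sample rows.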
```sql
SELECT *
-FROM @drive_stage/sessions/session_2024_08_16.parquet
+FROM @citydrive_stage/videos/capture_date=2025-01-01/videos.parquet
LIMIT 5;
SELECT *
-FROM @drive_stage/frame-events/frame_events_2024_08_16.parquet
+FROM @citydrive_stage/frame-events/batch_2025_01_01.parquet
LIMIT 5;
```
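-Repeat the preview for payloads and embeddings as needed. Databend automatically uses the file format specified on the Stage.
+Databend reuses the file format defined on the Stage, so no extra parameters are needed.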
---
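-## 3. COPY INTO
-Load each file into the tables used by the guides. Inline casts map input columns to table columns; the projections below use Parquet as the example, and other formats work the same way.
+## 3. COPY INTO the Shared Tables
+Each export maps to one shared table used by the guides. Inline `::TYPE` casts keep the schema consistent between upstream and downstream.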
-### Sessions
+### `citydrive_videos`
```sql
-COPY INTO drive_sessions (session_id, vehicle_id, route_name, start_time, end_time, weather, camera_setup)
+COPY INTO citydrive_videos (video_id, vehicle_id, capture_date, route_name, weather, camera_source, duration_sec)
FROM (
- SELECT session_id::STRING,
+ SELECT video_id::STRING,
vehicle_id::STRING,
+ capture_date::DATE,
route_name::STRING,
- start_time::TIMESTAMP,
- end_time::TIMESTAMP,
weather::STRING,
- camera_setup::STRING
- FROM @drive_stage/sessions/
+ camera_source::STRING,
+ duration_sec::INT
+ FROM @citydrive_stage/videos/
)
FILE_FORMAT = (TYPE = 'PARQUET');
```
-### Frame Events
+### `frame_events`
```sql
-COPY INTO frame_events (frame_id, session_id, frame_index, captured_at, event_type, risk_score)
+COPY INTO frame_events (frame_id, video_id, frame_index, collected_at, event_tag, risk_score, speed_kmh)
FROM (
SELECT frame_id::STRING,
- session_id::STRING,
+ video_id::STRING,
frame_index::INT,
- captured_at::TIMESTAMP,
- event_type::STRING,
- risk_score::DOUBLE
- FROM @drive_stage/frame-events/
+ collected_at::TIMESTAMP,
+ event_tag::STRING,
+ risk_score::DOUBLE,
+ speed_kmh::DOUBLE
+ FROM @citydrive_stage/frame-events/
)
FILE_FORMAT = (TYPE = 'PARQUET');
```
-### Detection Payloads
-The payload files contain a nested column (the `payload` column is a JSON object). Copy it into the `frame_payloads` table with the same projection.
+### `frame_metadata_catalog`
+```sql
+COPY INTO frame_metadata_catalog (doc_id, meta_json, captured_at)
+FROM (
+ SELECT doc_id::STRING,
+ meta_json::VARIANT,
+ captured_at::TIMESTAMP
+ FROM @citydrive_stage/manifests/
+)
+FILE_FORMAT = (TYPE = 'PARQUET');
+```
+### `frame_embeddings`
```sql
-COPY INTO frame_payloads (frame_id, run_stage, payload, logged_at)
+COPY INTO frame_embeddings (frame_id, video_id, sensor_view, embedding, encoder_build, created_at)
FROM (
SELECT frame_id::STRING,
- run_stage::STRING,
- payload,
- logged_at::TIMESTAMP
- FROM @drive_stage/payloads/
+ video_id::STRING,
+ sensor_view::STRING,
+ embedding::VECTOR(768), -- adjust to the actual dimension
+ encoder_build::STRING,
+ created_at::TIMESTAMP
+ FROM @citydrive_stage/frame-embeddings/
)
FILE_FORMAT = (TYPE = 'PARQUET');
```
-### Frame Embeddings
+### `frame_geo_points`
```sql
-COPY INTO frame_embeddings (frame_id, session_id, embedding, model_version, created_at)
+COPY INTO frame_geo_points (video_id, frame_id, position_wgs84, solution_grade, source_system, created_at)
FROM (
- SELECT frame_id::STRING,
- session_id::STRING,
- embedding::VECTOR(4), -- replace 4 with the actual embedding dimension
- model_version::STRING,
+ SELECT video_id::STRING,
+ frame_id::STRING,
+ position_wgs84::GEOMETRY,
+ solution_grade::INT,
+ source_system::STRING,
+ created_at::TIMESTAMP
+ FROM @citydrive_stage/frame-locations/
+)
+FILE_FORMAT = (TYPE = 'PARQUET');
+```
+
+### `signal_contact_points`
+```sql
+COPY INTO signal_contact_points (node_id, signal_position, video_id, frame_id, frame_position, distance_m, created_at)
+FROM (
+ SELECT node_id::STRING,
+ signal_position::GEOMETRY,
+ video_id::STRING,
+ frame_id::STRING,
+ frame_position::GEOMETRY,
+ distance_m::DOUBLE,
created_at::TIMESTAMP
- FROM @drive_stage/embeddings/
+ FROM @citydrive_stage/traffic-lights/
)
FILE_FORMAT = (TYPE = 'PARQUET');
```
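-Every downstream guide (analytics/search/vector/geo) can now see this batch.
+Once this completes, every workload (SQL analytics, `QUERY()` search, vector similarity, geo filtering) reads exactly the same data.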
---
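-## 4. Stream (Optional)
-If downstream jobs should detect new rows after each `COPY INTO`, create a Stream on key tables such as `frame_events`. For usage, see [Continuous Pipeline → Stream](/guides/load-data/continuous-data-pipelines/stream).
+## 4. Streams (Optional)
+Want downstream jobs to consume only the rows added by the latest batch? Create a Stream on the target table.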
```sql
CREATE OR REPLACE STREAM frame_events_stream ON TABLE frame_events;
-SELECT * FROM frame_events_stream; -- shows new rows since the last consumption
+SELECT * FROM frame_events_stream; -- inspect the rows just loaded by COPY
+-- …process…
+SELECT * FROM frame_events_stream WITH CONSUME; -- advance the cursor
```
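-After processing, run `CONSUME STREAM frame_events_stream;` (or insert the rows into another table) to advance the offset.
+`WITH CONSUME` advances the offset once you have processed the rows. Reference: [Streams](/guides/load-data/continuous-data-pipelines/stream).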
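The difference between a plain `SELECT` on a stream and one with `WITH CONSUME` can be pictured as a cursor over an append-only table. The toy model below is only an illustration of that idea: `ToyStream` and its methods are hypothetical and do not reflect how Databend implements streams internally.

```python
class ToyStream:
    """Toy cursor over an append-only table (illustration only, not Databend's implementation)."""

    def __init__(self, table_rows):
        self._rows = table_rows          # shared list standing in for the base table
        self._offset = len(table_rows)   # the stream starts at the table's current end

    def select(self, consume=False):
        """Return rows appended since the offset; advance the offset only when consume=True."""
        new_rows = self._rows[self._offset:]
        if consume:
            self._offset = len(self._rows)
        return new_rows


frame_events = []                       # stands in for the frame_events table
stream = ToyStream(frame_events)        # CREATE STREAM ... ON TABLE frame_events

frame_events.extend(["FRAME-0001", "FRAME-0002"])  # stands in for one COPY INTO batch

peek = stream.select()                  # plain SELECT: rows are visible, cursor stays put
consumed = stream.select(consume=True)  # WITH CONSUME: same rows, cursor moves forward
after = stream.select()                 # nothing new until the next batch arrives

print(peek, consumed, after)
```

In this model a plain read is repeatable, while a consuming read hands each batch to exactly one downstream pass.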
---
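-## 5. Task (Optional)
-A Task runs **one SQL statement** on a schedule. Create a small Task per table (or call a stored procedure as a single entry point).
+## 5. Tasks (Optional)
+A Task runs **a single SQL statement** on a schedule. You can create one lightweight Task per table, or put the logic in a stored procedure and call it from a Task.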
```sql
-CREATE OR REPLACE TASK task_load_sessions
+CREATE OR REPLACE TASK task_load_citydrive_videos
WAREHOUSE = 'default'
- SCHEDULE = 5 MINUTE
+ SCHEDULE = 10 MINUTE
AS
- COPY INTO drive_sessions (session_id, vehicle_id, route_name, start_time, end_time, weather, camera_setup)
+ COPY INTO citydrive_videos (video_id, vehicle_id, capture_date, route_name, weather, camera_source, duration_sec)
FROM (
- SELECT session_id::STRING,
+ SELECT video_id::STRING,
vehicle_id::STRING,
+ capture_date::DATE,
route_name::STRING,
- start_time::TIMESTAMP,
- end_time::TIMESTAMP,
weather::STRING,
- camera_setup::STRING
- FROM @drive_stage/sessions/
+ camera_source::STRING,
+ duration_sec::INT
+ FROM @citydrive_stage/videos/
)
FILE_FORMAT = (TYPE = 'PARQUET');
-ALTER TASK task_load_sessions RESUME;
+ALTER TASK task_load_citydrive_videos RESUME;
CREATE OR REPLACE TASK task_load_frame_events
WAREHOUSE = 'default'
- SCHEDULE = 5 MINUTE
+ SCHEDULE = 10 MINUTE
AS
- COPY INTO frame_events (frame_id, session_id, frame_index, captured_at, event_type, risk_score)
+ COPY INTO frame_events (frame_id, video_id, frame_index, collected_at, event_tag, risk_score, speed_kmh)
FROM (
SELECT frame_id::STRING,
- session_id::STRING,
+ video_id::STRING,
frame_index::INT,
- captured_at::TIMESTAMP,
- event_type::STRING,
- risk_score::DOUBLE
- FROM @drive_stage/frame-events/
+ collected_at::TIMESTAMP,
+ event_tag::STRING,
+ risk_score::DOUBLE,
+ speed_kmh::DOUBLE
+ FROM @citydrive_stage/frame-events/
)
FILE_FORMAT = (TYPE = 'PARQUET');
ALTER TASK task_load_frame_events RESUME;
-
--- repeat for frame_payloads and frame_embeddings
```
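-For cron syntax, dependency setup, and error handling, see [Continuous Pipeline → Task](/guides/load-data/continuous-data-pipelines/task).
\ No newline at end of file
+Add Tasks for the remaining tables following the same pattern. For more scheduling and dependency options, see [Tasks](/guides/load-data/continuous-data-pipelines/task).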
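Because each Task wraps the same `COPY INTO` pattern, the statements for the remaining tables can be generated rather than copied by hand. The sketch below is one possible approach, not part of the guide's workflow. `TASK_TEMPLATE` and `build_task_sql` are hypothetical names, and the warehouse name, schedule, and stage folder simply mirror the values used in the examples above.

```python
TASK_TEMPLATE = """\
CREATE OR REPLACE TASK task_load_{table}
  WAREHOUSE = 'default'
  SCHEDULE = 10 MINUTE
AS
  COPY INTO {table} ({column_list})
  FROM (
    SELECT {projection}
    FROM @citydrive_stage/{folder}/
  )
  FILE_FORMAT = (TYPE = 'PARQUET');

ALTER TASK task_load_{table} RESUME;"""


def build_task_sql(table, folder, columns):
    """Render the CREATE TASK / ALTER TASK pair for one table.

    columns is a list of (column_name, sql_type) pairs in target-table order.
    """
    column_list = ", ".join(name for name, _ in columns)
    # Indent continuation lines so they line up under the first projected column.
    projection = ",\n           ".join(
        f"{name}::{sql_type}" for name, sql_type in columns
    )
    return TASK_TEMPLATE.format(
        table=table, folder=folder, column_list=column_list, projection=projection
    )


# Column list taken from the frame_metadata_catalog COPY INTO shown earlier.
print(build_task_sql(
    "frame_metadata_catalog",
    "manifests",
    [("doc_id", "STRING"), ("meta_json", "VARIANT"), ("captured_at", "TIMESTAMP")],
))
```

Review the generated SQL before running it, since the template hard-codes the warehouse and the 10-minute schedule.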
+
+---
+
+Once these jobs are running, every guide in the "Unified Workloads" series reads the same CityDrive tables, with no extra ETL and no duplicated storage.
diff --git a/docs/cn/guides/54-query/_category_.json b/docs/cn/guides/54-query/_category_.json
index eceb721ed2..40762446c3 100644
--- a/docs/cn/guides/54-query/_category_.json
+++ b/docs/cn/guides/54-query/_category_.json
@@ -1,3 +1,3 @@
{
- "label": "统一工作负载(Unified Workloads)"
-}
\ No newline at end of file
+ "label": "统一引擎场景"
+}
diff --git a/docs/cn/guides/54-query/index.md b/docs/cn/guides/54-query/index.md
index ab8959fdc9..035b9c9200 100644
--- a/docs/cn/guides/54-query/index.md
+++ b/docs/cn/guides/54-query/index.md
@@ -1,15 +1,15 @@
---
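-title: Unified Workloads
+title: Unified Engine Scenarios
---
-Databend now serves as a unified engine for SQL analytics, multimodal search, vector similarity, geospatial analysis, and continuous ETL. This mini-series uses the **EverDrive Smart Vision** scenario (session IDs such as `SES-20240801-SEA01`, frame IDs such as `FRAME-0001`) to show how one dataset flows through every workload without being copied across systems.
+CityDrive Intelligence stores every drive: it splits each video into frames and writes structured metadata, JSON manifests, behavior tags, vector features, and GPS tracks for every `video_id`. The guides below show how Databend runs all of these needs in a single warehouse, without copying data or standing up separate search or vector clusters.
-| Guide | What it covers |
+| Guide | Summary |
|-------|----------------|
-| [SQL Analytics](./00-sql-analytics.md) | Build shared tables, slice sessions, add window/aggregate acceleration |
-| [JSON & Search](./01-json-search.md) | Store detection payloads and `QUERY` risky scenarios |
-| [Vector Search](./02-vector-db.md) | Keep frame embeddings and find semantic neighbors |
-| [Geo Analytics](./03-geo-analytics.md) | Map events with `HAVERSINE`, polygons, and H3 |
-| [Lakehouse ETL](./04-lakehouse-etl.md) | Stage files, `COPY INTO` tables, optional streams/tasks |
+| [SQL Analytics](./00-sql-analytics.md) | Build the base tables; demonstrate filters, joins, windows, and aggregating indexes |
+| [JSON & Search](./01-json-search.md) | Load `frame_metadata_catalog`, run Elasticsearch-style `QUERY()`, and join bitmap tags |
+| [Vector Search](./02-vector-db.md) | Keep vector features, run semantic similarity search with cosine distance, and tie in risk metrics |
+| [Geo Analytics](./03-geo-analytics.md) | Use `GEOMETRY`, distance/polygon filters, and traffic-light joins |
+| [Lakehouse ETL](./04-lakehouse-etl.md) | Stage once, `COPY INTO` the shared tables, and optionally add Streams/Tasks |
-Work through them in order to see how Databend's single query optimizer powers analytics, search, vector, geo, and loading pipelines on the same fleet data.
\ No newline at end of file
+Work through them in order to see how the same CityDrive identifiers run through classic SQL, full-text search, vector, geo, and ETL, all hosted by Databend's single execution engine.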
diff --git a/docs/cn/sql-reference/20-sql-functions/10-search-functions/index.md b/docs/cn/sql-reference/20-sql-functions/10-search-functions/index.md
index ab14db315c..7b884a5769 100644
--- a/docs/cn/sql-reference/20-sql-functions/10-search-functions/index.md
+++ b/docs/cn/sql-reference/20-sql-functions/10-search-functions/index.md
@@ -23,7 +23,7 @@ CREATE OR REPLACE TABLE frames (
| Function | Description | Example |
|----------|-------------|---------|
| [MATCH](match) | Runs a relevance-ranked search over the specified columns. | `MATCH('summary, tags', 'traffic light red')` |
-| [QUERY](query) | Parses a Lucene-style query expression and supports nested `VARIANT` fields. | `QUERY('meta.signals.traffic_light:red')` |
+| [QUERY](query) | Parses an Elasticsearch-style query expression and supports nested `VARIANT` fields. | `QUERY('meta.signals.traffic_light:red')` |
| [SCORE](score) | Returns the relevance score of the current row when used with `MATCH` or `QUERY`. | `SELECT summary, SCORE() FROM frame_notes WHERE MATCH('summary, tags', 'traffic light red')` |
## Query Syntax Examples
@@ -89,4 +89,4 @@ SELECT id, meta['frame']['timestamp'] AS ts, SCORE()
FROM frames
WHERE QUERY('meta.signals.traffic_light:red^1.0 AND meta.tags:urban^2.0')
LIMIT 100;
-```
\ No newline at end of file
+```
diff --git a/docs/cn/sql-reference/20-sql-functions/10-search-functions/query.md b/docs/cn/sql-reference/20-sql-functions/10-search-functions/query.md
index 76d86c9663..4cafd84057 100644
--- a/docs/cn/sql-reference/20-sql-functions/10-search-functions/query.md
+++ b/docs/cn/sql-reference/20-sql-functions/10-search-functions/query.md
@@ -5,7 +5,7 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
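-`QUERY` filters rows by matching a Lucene-style query expression against columns that have an inverted index. Use dot notation to navigate nested fields in `VARIANT` columns. The function only takes effect in a `WHERE` clause.
+`QUERY` filters rows by matching an Elasticsearch-style query expression against columns that have an inverted index. Use dot notation to navigate nested fields in `VARIANT` columns. The function only takes effect in a `WHERE` clause.
:::info
Databend's QUERY function is inspired by Elasticsearch's [QUERY](https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-functions-search.html#sql-functions-search-query).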
@@ -179,4 +179,4 @@ SELECT id, meta['frame']['timestamp'] AS ts
FROM frames
WHERE QUERY('meta.detections.text:SCHOOL AND meta.scene.time_of_day:day');
-- returns id 3
-```
\ No newline at end of file
+```
diff --git a/i18n/zh/code.json b/i18n/zh/code.json
index 9658929e8a..868e7d774c 100644
--- a/i18n/zh/code.json
+++ b/i18n/zh/code.json
@@ -27,7 +27,7 @@
"description": "The first paragraph of the 404 page"
},
"theme.NotFound.p2": {
- "message": "请联系原始链接来源网站的所有者,并告知他们链接已损坏。",
+ "message": "请联系原始链接来源网站的所有者,并告知他们链接已损坏。",
"description": "The 2nd paragraph of the 404 page"
},
"theme.admonition.note": {
@@ -138,7 +138,7 @@
"description": "The label used to tell the user that he's browsing an unreleased doc version"
},
"theme.docs.versions.unmaintainedVersionLabel": {
- "message": "此为 {siteTitle} {versionLabel} 版的文档,现已不再积极维护。",
+ "message": "此为 {siteTitle} {versionLabel} 版的文档,现已不再积极维护。",
"description": "The label used to tell the user that he's browsing an unmaintained doc version"
},
"theme.docs.versions.latestVersionSuggestionLabel": {
@@ -414,7 +414,7 @@
"description": "Thanks for voting!"
},
"Did this page help you?": {
- "message": "指出文档中的错误或问题,我们将会赠予您专属纪念 T 恤一件!",
+ "message": "指出文档中的错误或问题,我们将会赠予您专属纪念 T 恤一件!",
"description": "Did this page help you?"
},
"Explore Databend Cloud for FREE": {
@@ -470,7 +470,7 @@
"description": "Cloud Data Analytics"
},
"Databend - Your best alternative to Snowflake. Cost-effective and simple for massive-scale analytics.": {
- "message": "Databend - 替代 Snowflake 的最佳方案。高性价比且简单易用,适用于大规模数据分析。",
+ "message": "Databend - 替代 Snowflake 的最佳方案。高性价比且简单易用,适用于大规模数据分析。",
"description": "Databend - Your best alternative to Snowflake. Cost-effective and simple for massive-scale analytics."
},
"PAGE NOT FOUND": {
@@ -478,7 +478,7 @@
"description": "PAGE NOT FOUND"
},
"Please check your link or head Home to regroup.": {
- "message": "页面地址可能有所变更或者不存在,请检查您的链接或返回到操作指南。",
+ "message": "页面地址可能有所变更或者不存在,请检查您的链接或返回到操作指南。",
"description": "Either you're out of bounds or that page doesn't exist. Please check your link or head Home to regroup."
},
"BACK TO HOME": {
@@ -522,7 +522,7 @@
"description": "Databend Cloud 部分的描述"
},
"Connect to Databend": {
- "message": "连接到 Databend",
+ "message": "连接 Databend",
"description": "连接到 Databend 部分的标题"
},
"Developer Resources": {
@@ -530,11 +530,11 @@
"description": "开发者资源的链接文字"
},
"Connect your application to Databend in just a few minutes.": {
- "message": "几分钟内即可让您的应用连上 Databend。",
+ "message": "几分钟内就能让应用接入 Databend。",
"description": "连接到 Databend 部分的描述"
},
"Load Data into Databend": {
- "message": "加载数据到 Databend",
+ "message": "向 Databend 加载数据",
"description": "加载数据到 Databend 部分的标题"
},
"Know More": {
@@ -542,27 +542,27 @@
"description": "了解更多的链接文字"
},
"Bulk import data into Databend(Cloud) in multiple formats.": {
- "message": "支持多种格式批量导入数据至 Databend(Cloud)。",
+ "message": "以多种格式批量把数据导入 Databend(含 Cloud)。",
"description": "加载数据到 Databend 部分的描述"
},
"AI & BI & Visualization & Notebooks": {
- "message": "AI & BI & 可视化 & 笔记本",
+ "message": "AI · BI · 可视化 · Notebook",
"description": "AI & BI & 可视化 & 笔记本 部分的标题"
},
"All Tools": {
- "message": "所有工具",
+ "message": "全部工具",
"description": "所有工具的链接文字"
},
"Databend offers connectors and plugins for integrating with major data import tools, ensuring efficient data synchronization.": {
- "message": "Databend 提供丰富的连接器与插件,可与主流数据导入工具无缝集成,保障数据高效同步。",
+ "message": "Databend 提供主流导入工具的连接器与插件,保障高效同步。",
"description": "AI & BI & 可视化 & 笔记本 部分的描述"
},
"Continuous Data Pipelines": {
- "message": "连续数据管道",
+ "message": "持续数据管道",
"description": "连续数据管道部分的标题"
},
"Data pipelines automate the process of moving and changing data from different sources into Databend.": {
- "message": "数据管道可自动完成多源数据的迁移、转换与加载到 Databend。",
+ "message": "数据管道自动完成多源采集、转换并写入 Databend。",
"description": "连续数据管道部分的描述"
},
"Real-Time CDC Ingestion": {
@@ -574,11 +574,11 @@
"description": "自动化数据管道的文本"
},
"Additional Informations": {
- "message": "更多信息",
+ "message": "更多资料",
"description": "额外信息部分的标题"
},
"AI Capabilities": {
- "message": "AI 功能",
+ "message": "AI 能力",
"description": "AI 功能的文本"
},
"Databend Products": {
@@ -590,7 +590,7 @@
"description": "安全的文本"
},
"Contact Support": {
- "message": "联系客服",
+ "message": "联系支持团队",
"description": "联系支持的文本"
},
"Pricing": {
@@ -598,15 +598,15 @@
"description": "价格的文本"
},
"Use Cases": {
- "message": "用户案例",
+ "message": "典型场景",
"description": "Use Cases"
},
"Introduction to Databend Products": {
- "message": "Databend 产品介绍",
+ "message": "Databend 产品导览",
"description": "Databend 产品介绍部分的标题"
},
"Choose the deployment option that best fits your needs and scale.": {
- "message": "选择最契合业务需求的部署方式,随需扩展。",
+ "message": "按业务规模选择最合适的部署方式。",
"description": "Databend 产品介绍部分的描述"
},
"Databend Cloud": {
@@ -614,7 +614,7 @@
"description": "Databend Cloud 产品的标题"
},
"Fully-managed cloud service. No setup required.": {
- "message": "全托管云服务,开箱即用。",
+ "message": "全托管云服务,开箱即可使用。",
"description": "Databend Cloud 产品的描述"
},
"Databend Enterprise": {
@@ -622,7 +622,7 @@
"description": "Databend Enterprise 产品的标题"
},
"Self-hosted with enterprise features and support.": {
- "message": "自主部署,拥有企业级功能与专业支持。",
+ "message": "自主部署,配备企业级功能与支持。",
"description": "Databend Enterprise 产品的描述"
},
"Databend Community": {
@@ -630,15 +630,15 @@
"description": "Databend 社区版 产品的标题"
},
"Open-source and free for all use cases.": {
- "message": "开源免费,适用于任何场景。",
+ "message": "开源且永久免费。",
"description": "Databend 社区版 产品的描述"
},
"Getting Started": {
- "message": "入门指南",
+ "message": "快速入门",
"description": "入门指南部分的标题"
},
"Create a Databend Cloud account or deploy your own Databend instance.": {
- "message": "注册 Databend Cloud 账户或自主部署 Databend 实例。",
+ "message": "注册 Databend Cloud 或自行部署实例。",
"description": "入门指南部分的描述"
},
"Activate Databend Cloud": {
@@ -686,7 +686,7 @@
"description": "升级 Databend 的链接文字"
},
"Changelog": {
- "message": "发布记录",
+ "message": "更新日志",
"description": "Changelog"
},
"FAQ": {
@@ -694,7 +694,7 @@
"description": "FAQ"
},
"Product Features": {
- "message": "产品特点",
+ "message": "产品特性",
"description": "Product Features"
},
"Unified Engine": {
@@ -718,23 +718,23 @@
"description": "Stores all data in object storage."
},
"Analytics, vector, search, and geo share one optimizer and runtime.": {
- "message": "分析、向量、搜索、地理信息共享统一的查询优化器与执行引擎。",
+ "message": "分析、向量、搜索与地理能力共用一套优化器和执行引擎。",
"description": "Description for unified engine feature"
},
"Unified Data": {
- "message": "统一数据",
+ "message": "统一数据层",
"description": "Headline for unified data feature"
},
"Structured, semi-structured, unstructured, and vector data share object storage.": {
- "message": "结构化、半结构化、非结构化及向量数据统一存储于对象存储中。",
+ "message": "结构化、半结构化、非结构化与向量数据共享同一对象存储。",
"description": "Description for unified data feature"
},
"Analytics Native": {
- "message": "原生分析能力",
+ "message": "原生分析引擎",
"description": "Headline for analytics native feature"
},
"ANSI SQL, windowing, incremental aggregates, and streaming power BI.": {
- "message": "标准 SQL、窗口函数、增量聚合与流式计算为 BI 分析提供强力支撑。",
+ "message": "ANSI SQL、窗口函数、增量聚合与流式处理为 BI 持续供能。",
"description": "Description for analytics native feature"
},
"Vector Native": {
@@ -742,7 +742,7 @@
"description": "Headline for vector native feature"
},
"Embeddings, vector indexes, and semantic retrieval all run in SQL.": {
- "message": "向量嵌入、向量索引与语义检索均可通过 SQL 直接完成。",
+ "message": "向量嵌入、索引与语义检索全部在 SQL 中完成。",
"description": "Description for vector native feature"
},
"Search Native": {
@@ -750,27 +750,27 @@
"description": "Headline for search native feature"
},
"JSON inverted indexes, geo functions, and ranking fuel hybrid maps.": {
- "message": "JSON 全文索引、地理函数与排序算法共同驱动混合检索。",
+ "message": "JSON 倒排索引、地理函数与排序能力共同驱动混合检索。",
"description": "Description for search native feature"
},
"Unified Deployment": {
- "message": "统一部署方式",
+ "message": "统一部署选择",
"description": "Headline for unified deployment feature"
},
"Databend runs the same in Cloud, Docker, or `pip install`.": {
- "message": "无论云端、Docker 还是 `pip install`,都是同一个 Databend 内核。",
+ "message": "无论 Cloud、Docker 还是 `pip install`,体验的都是同一个 Databend 引擎。",
"description": "Description for unified deployment feature"
},
"Start with Databend Cloud": {
- "message": "注册 Databend Cloud",
+ "message": "从 Databend Cloud 起步",
"description": "Start with Databend Cloud"
},
"Get started in minutes with our fully-managed cloud service. No setup required.": {
- "message": "几分钟即可上手我们的全托管云服务,无需任何配置。",
+ "message": "几分钟即可启用全托管云服务,无需任何额外配置。",
"description": "Get started in minutes with our fully-managed cloud service. No setup required."
},
"What you need to know:": {
- "message": "您需要了解的内容:",
+ "message": "重点信息:",
"description": "What you need to know:"
},
"Choose Your Edition": {
@@ -778,7 +778,7 @@
"description": "Choose Your Edition"
},
"Pricing & Plans": {
- "message": "定价与计划",
+ "message": "价格与套餐",
"description": "Pricing & Plans"
},
"Using Databend Cloud": {
@@ -786,23 +786,23 @@
"description": "Using Databend Cloud"
},
"Deploy Your Own Instance": {
- "message": "部署您自己的实例",
+ "message": "自主部署实例",
"description": "Deploy Your Own Instance"
},
"Install Databend on your infrastructure for complete control and customization.": {
- "message": "部署在您自己的基础设施上,实现完全自主可控与深度定制。",
+ "message": "在自有基础设施上安装 Databend,配置完全可控。",
"description": "Install Databend on your infrastructure for complete control and customization."
},
"5-Minute Quick Start": {
- "message": "5 分钟快速开始",
+ "message": "5 分钟快速上手",
"description": "5-Minute Quick Start"
},
"Download & Install": {
- "message": "下载与安装",
+ "message": "下载并安装",
"description": "Download & Install"
},
"Enterprise Features & Licensing": {
- "message": "企业功能与许可",
+ "message": "企业特性与许可",
"description": "Enterprise Features & Licensing"
},
"Copy Page": {
@@ -810,7 +810,7 @@
"description": "Copy Page"
},
"Copy page as Markdown for LLMs": {
- "message": "复制为 Markdown 格式,供大语言模型使用",
+ "message": "复制为 Markdown 格式,供大语言模型使用",
"description": "Copy page as Markdown for LLMs"
},
"View as Markdown": {