Skip to content

Commit 197667e

Browse files
yzeng1618zengyiSbloodyS
authored
[Improvement-17994][Seatunnel] harden startupScript and -i args (#17996)
* [Improvement-17994][Seatunnel] harden startupScript and -i args * [Improvement-17994][Seatunnel] update docs * [Improvement-17994][Seatunnel] update docs to 2.3.3 * [Improvement-17994][Seatunnel] add UT --------- Co-authored-by: zengyi <zengyi@chinatelecom.cn> Co-authored-by: xiangzihao <460888207@qq.com>
1 parent 58e60fe commit 197667e

6 files changed

Lines changed: 85 additions & 13 deletions

File tree

docs/docs/en/guide/task/seatunnel.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
## Overview
44

5-
`SeaTunnel` task type for creating and executing `SeaTunnel` tasks. When the worker executes this task, it will parse the config file through the `start-seatunnel-spark.sh` , `start-seatunnel-flink.sh` or `seatunnel.sh` command.
5+
`SeaTunnel` task type for creating and executing `SeaTunnel` tasks. When the worker executes this task, it will parse and run the config file through the startup scripts under `${SEATUNNEL_HOME}/bin/` (such as `seatunnel.sh` / `start-seatunnel-*-connector-v2.sh`).
66
Click [here](https://seatunnel.apache.org/) for more information about `Apache SeaTunnel`.
77

88
## Create Task
@@ -16,7 +16,7 @@ Click [here](https://seatunnel.apache.org/) for more information about `Apache S
1616
[//]: # (- Please refer to [DolphinScheduler Task Parameters Appendix]&#40;appendix.md#default-task-parameters&#41; `Default Task Parameters` section for default parameters.)
1717

1818
- Please refer to [DolphinScheduler Task Parameters Appendix](appendix.md) `Default Task Parameters` section for default parameters.
19-
- Startup script: Select script name to start the task, including `seatunnel.sh`, `start-seatunnel-flink-13-connector-v2.sh`, `start-seatunnel-flink-15-connector-v2.sh`, `start-seatunnel-flink-connector-v2.sh`, `start-seatunnel-flink.sh`, `start-seatunnel-spark-2-connector-v2.sh`, `start-seatunnel-spark-3-connector-v2.sh`, `start-seatunnel-spark-connector-v2.sh`, `start-seatunnel-spark.sh`
19+
- Startup script: Select script name to start the task (it may vary across SeaTunnel distributions, please check `${SEATUNNEL_HOME}/bin/`), including `seatunnel.sh`, `start-seatunnel-flink-13-connector-v2.sh`, `start-seatunnel-flink-15-connector-v2.sh`, `start-seatunnel-flink-connector-v2.sh`, `start-seatunnel-flink.sh`, `start-seatunnel-spark-2-connector-v2.sh`, `start-seatunnel-spark-3-connector-v2.sh`, `start-seatunnel-spark-connector-v2.sh`, `start-seatunnel-spark.sh`
2020
- FLINK
2121
- Run model: supports `run` and `run-application` modes
2222
- Option parameters: used to add the parameters of the Flink engine, such as `-m yarn-cluster -ynm seatunnel`
@@ -82,7 +82,7 @@ sink {
8282

8383
### Support SeaTunnel Version
8484

85-
- v2.3.1
86-
- v2.3.2
87-
- v2.3.3
85+
- The examples in this doc are based on the `2.3.x` CLI options and startup scripts
86+
- Verified: v2.3.1, v2.3.2, v2.3.3
87+
- Other versions: this task type is essentially a wrapper of SeaTunnel CLI. Newer versions usually work as long as the startup scripts and CLI options are compatible (please run regression tests after upgrading).
8888

docs/docs/zh/guide/task/seatunnel.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
## 综述
44

5-
`SeaTunnel` 任务类型,用于创建并执行 `SeaTunnel` 类型任务。worker 执行该任务的时候,会通过 `start-seatunnel-spark.sh``start-seatunnel-flink.sh` `seatunnel.sh` 命令解析 config 文件。
5+
`SeaTunnel` 任务类型,用于创建并执行 `SeaTunnel` 类型任务。worker 执行该任务的时候,会通过 `${SEATUNNEL_HOME}/bin/` 下的启动脚本(如 `seatunnel.sh` / `start-seatunnel-*-connector-v2.sh`)解析并执行 config 文件。
66
点击 [这里](https://seatunnel.apache.org/) 获取更多关于 `Apache SeaTunnel` 的信息。
77

88
## 创建任务
@@ -16,7 +16,7 @@
1616
[//]: # (- 默认参数说明请参考[DolphinScheduler任务参数附录]&#40;appendix.md#默认任务参数&#41;`默认任务参数`一栏。)
1717

1818
- 默认参数说明请参考[DolphinScheduler任务参数附录](appendix.md)`默认任务参数`一栏。
19-
- 启动脚本:选择你想要运行任务的启动脚本,包括 `seatunnel.sh`, `start-seatunnel-flink-13-connector-v2.sh`, `start-seatunnel-flink-15-connector-v2.sh`, `start-seatunnel-flink-connector-v2.sh`, `start-seatunnel-flink.sh`, `start-seatunnel-spark-2-connector-v2.sh`, `start-seatunnel-spark-3-connector-v2.sh`, `start-seatunnel-spark-connector-v2.sh`, `start-seatunnel-spark.sh`
19+
- 启动脚本:选择你想要运行任务的启动脚本(不同 SeaTunnel 发行包可能存在差异,以实际 `${SEATUNNEL_HOME}/bin/` 为准),包括 `seatunnel.sh`, `start-seatunnel-flink-13-connector-v2.sh`, `start-seatunnel-flink-15-connector-v2.sh`, `start-seatunnel-flink-connector-v2.sh`, `start-seatunnel-flink.sh`, `start-seatunnel-spark-2-connector-v2.sh`, `start-seatunnel-spark-3-connector-v2.sh`, `start-seatunnel-spark-connector-v2.sh`, `start-seatunnel-spark.sh`
2020
- FLINK
2121
- 运行模型:支持 `run``run-application` 两种模式
2222
- 选项参数:用于添加 Flink 引擎本身参数,例如 `-m yarn-cluster -ynm seatunnel`
@@ -82,7 +82,7 @@ sink {
8282

8383
### 支持 SeaTunnel 版本
8484

85-
- 2.3.1
86-
- 2.3.2
87-
- 2.3.3
85+
- 文档示例基于 `2.3.x` 版本的命令行参数与启动脚本
86+
- 已验证:2.3.1、2.3.2、2.3.3
87+
- 其他版本:该任务类型本质是对 SeaTunnel CLI 的封装,如 SeaTunnel 启动脚本与命令行参数保持兼容,通常可直接使用更高版本(建议升级后先做回归验证)
8888

dolphinscheduler-task-plugin/dolphinscheduler-task-seatunnel/src/main/java/org/apache/dolphinscheduler/plugin/task/seatunnel/SeatunnelParameters.java

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@
2525
import org.apache.commons.lang3.StringUtils;
2626

2727
import java.util.List;
28-
import java.util.Objects;
28+
import java.util.regex.Pattern;
2929

3030
import lombok.Getter;
3131
import lombok.NoArgsConstructor;
@@ -36,6 +36,8 @@
3636
@NoArgsConstructor
3737
public class SeatunnelParameters extends AbstractParameters {
3838

39+
private static final Pattern STARTUP_SCRIPT_PATTERN = Pattern.compile("^[A-Za-z0-9][A-Za-z0-9._-]*\\.sh$");
40+
3941
private String startupScript;
4042

4143
private Boolean useCustom;
@@ -49,12 +51,16 @@ public class SeatunnelParameters extends AbstractParameters {
4951

5052
@Override
5153
public boolean checkParameters() {
52-
return Objects.nonNull(startupScript)
54+
return isValidStartupScript(startupScript)
5355
&& ((BooleanUtils.isTrue(useCustom) && StringUtils.isNotBlank(rawScript))
5456
|| (BooleanUtils.isFalse(useCustom) && CollectionUtils.isNotEmpty(resourceList)
5557
&& resourceList.size() == 1));
5658
}
5759

60+
private static boolean isValidStartupScript(String startupScript) {
61+
return StringUtils.isNotBlank(startupScript) && STARTUP_SCRIPT_PATTERN.matcher(startupScript).matches();
62+
}
63+
5864
@Override
5965
public List<ResourceInfo> getResourceFilesList() {
6066
return resourceList;

dolphinscheduler-task-plugin/dolphinscheduler-task-seatunnel/src/main/java/org/apache/dolphinscheduler/plugin/task/seatunnel/SeatunnelTask.java

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -178,11 +178,19 @@ private List<String> generateTaskParameters() {
178178
List<String> parameters = new ArrayList<>();
179179
variables.forEach((k, v) -> {
180180
parameters.add("-i");
181-
parameters.add(String.format("%s='%s'", k, v));
181+
parameters.add(String.format("%s=%s", k, quoteForBash(v)));
182182
});
183183
return parameters;
184184
}
185185

186+
private static String quoteForBash(String value) {
187+
if (value == null) {
188+
return "''";
189+
}
190+
// Escape single quotes in a bash-safe way: abc'def -> 'abc'"'"'def'
191+
return "'" + value.replace("'", "'\"'\"'") + "'";
192+
}
193+
186194
private String buildCustomConfigContent() {
187195
log.info("raw custom config content : {}", seatunnelParameters.getRawScript());
188196
String script = seatunnelParameters.getRawScript().replaceAll("\\r\\n", System.lineSeparator());
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
/*
2+
* Licensed to the Apache Software Foundation (ASF) under one or more
3+
* contributor license agreements. See the NOTICE file distributed with
4+
* this work for additional information regarding copyright ownership.
5+
* The ASF licenses this file to You under the Apache License, Version 2.0
6+
* (the "License"); you may not use this file except in compliance with
7+
* the License. You may obtain a copy of the License at
8+
*
9+
* http://www.apache.org/licenses/LICENSE-2.0
10+
*
11+
* Unless required by applicable law or agreed to in writing, software
12+
* distributed under the License is distributed on an "AS IS" BASIS,
13+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
* See the License for the specific language governing permissions and
15+
* limitations under the License.
16+
*/
17+
18+
package org.apache.dolphinscheduler.plugin.task.seatunnel;
19+
20+
import org.junit.jupiter.api.Assertions;
21+
import org.junit.jupiter.api.Test;
22+
23+
public class SeatunnelParametersTest {
24+
25+
@Test
26+
public void testInvalidStartupScript() {
27+
SeatunnelParameters seatunnelParameters = new SeatunnelParameters();
28+
seatunnelParameters.setUseCustom(true);
29+
seatunnelParameters.setRawScript("env { execution.parallelism = 1 }");
30+
31+
seatunnelParameters.setStartupScript("../../../etc/passwd");
32+
Assertions.assertFalse(seatunnelParameters.checkParameters());
33+
34+
seatunnelParameters.setStartupScript("script.sh; rm -rf /");
35+
Assertions.assertFalse(seatunnelParameters.checkParameters());
36+
37+
seatunnelParameters.setStartupScript("script");
38+
Assertions.assertFalse(seatunnelParameters.checkParameters());
39+
40+
seatunnelParameters.setStartupScript(".sh");
41+
Assertions.assertFalse(seatunnelParameters.checkParameters());
42+
}
43+
}

dolphinscheduler-task-plugin/dolphinscheduler-task-seatunnel/src/test/java/org/apache/dolphinscheduler/plugin/task/seatunnel/SeatunnelTaskTest.java

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,7 @@
2727

2828
import org.apache.commons.io.FileUtils;
2929

30+
import java.lang.reflect.Method;
3031
import java.util.ArrayList;
3132
import java.util.Collections;
3233
import java.util.List;
@@ -133,6 +134,20 @@ public void testParameterPass() throws Exception {
133134
Assertions.assertEquals(expectedCommand, command);
134135
}
135136

137+
@Test
138+
public void testQuoteForBash() throws Exception {
139+
Assertions.assertEquals("'value'", invokeQuoteForBash("value"));
140+
Assertions.assertEquals("'abc'\"'\"'def'", invokeQuoteForBash("abc'def"));
141+
Assertions.assertEquals("''", invokeQuoteForBash(null));
142+
Assertions.assertEquals("'$(rm -rf /)'", invokeQuoteForBash("$(rm -rf /)"));
143+
}
144+
145+
private String invokeQuoteForBash(String value) throws Exception {
146+
Method quoteForBash = SeatunnelTask.class.getDeclaredMethod("quoteForBash", String.class);
147+
quoteForBash.setAccessible(true);
148+
return (String) quoteForBash.invoke(null, value);
149+
}
150+
136151
private static final String RAW_SCRIPT = "env {\n" +
137152
" execution.parallelism = 2\n" +
138153
" job.mode = \"BATCH\"\n" +

0 commit comments

Comments
 (0)