Commit 4aa94fb

sryza authored and huangxiaopingRD committed
[SPARK-54435][SDP] spark-pipelines init should avoid overwriting existing directory
### What changes were proposed in this pull request?

If the name provided to `spark-pipelines init` matches an existing directory, raise an error instead of overwriting the existing directory's contents.

### Why are the changes needed?

To help users avoid accidentally overwriting their code.

### Does this PR introduce _any_ user-facing change?

Yes, to an unreleased version.

### How was this patch tested?

Added a unit test for the case where `init` is invoked with an existing directory.

### Was this patch authored or co-authored using generative AI tooling?

Closes apache#53140 from sryza/initoverwrite.

Authored-by: Sandy Ryza <sandy.ryza@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
1 parent b95ab64 commit 4aa94fb

2 files changed: 20 additions, 0 deletions

python/pyspark/pipelines/init_cli.py

Lines changed: 5 additions & 0 deletions
@@ -44,6 +44,11 @@ def example_python_materialized_view() -> DataFrame:
 def init(name: str) -> None:
     """Generates a simple pipeline project."""
     project_dir = Path.cwd() / name
+    if project_dir.exists():
+        raise FileExistsError(
+            f"Directory '{name}' already exists. "
+            "Please choose a different name or remove the existing directory."
+        )
     project_dir.mkdir(parents=True, exist_ok=False)

     # Create the storage directory
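
To try the guard outside the Spark source tree, the following is a minimal standalone sketch. `create_project_dir` and its `base` parameter are hypothetical names chosen for illustration and are not part of `pyspark.pipelines`; only the existence check and the error message mirror the diff above.

```python
import tempfile
from pathlib import Path


def create_project_dir(base: Path, name: str) -> Path:
    """Create base/name, refusing to touch a directory that already exists.

    Hypothetical helper for illustration; it mirrors the guard added to init().
    """
    project_dir = base / name
    if project_dir.exists():
        # Fail early with an actionable message instead of reusing the directory.
        raise FileExistsError(
            f"Directory '{name}' already exists. "
            "Please choose a different name or remove the existing directory."
        )
    project_dir.mkdir(parents=True, exist_ok=False)
    return project_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        create_project_dir(base, "demo")
        try:
            # The second call hits the guard because "demo" now exists.
            create_project_dir(base, "demo")
        except FileExistsError as err:
            print(err)
```

In this sketch, `Path.mkdir(exist_ok=False)` would also raise `FileExistsError` for an existing target, so the visible effect of the explicit check is the clearer, user-facing message.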

python/pyspark/pipelines/tests/test_init_cli.py

Lines changed: 15 additions & 0 deletions
@@ -72,6 +72,21 @@ def test_init(self):
                         Path("transformations") / "example_sql_materialized_view.sql",
                     )

+    def test_init_existing_directory(self):
+        with tempfile.TemporaryDirectory() as temp_dir:
+            project_name = "test_project"
+            with change_dir(Path(temp_dir)):
+                init(project_name)
+
+                with self.assertRaises(FileExistsError) as context:
+                    init(project_name)
+
+                expected_message = (
+                    f"Directory '{project_name}' already exists. "
+                    "Please choose a different name or remove the existing directory."
+                )
+                self.assertEqual(str(context.exception), expected_message)
+

 if __name__ == "__main__":
     try:
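
The test relies on a `change_dir` helper that is not shown in this diff. As a rough sketch, such a context manager might look like the following, assuming it only switches the working directory and restores it afterwards; the real helper in the test module may differ.

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def change_dir(path: Path) -> Iterator[None]:
    """Temporarily switch the working directory (assumed behavior)."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the original directory even if the body raises.
        os.chdir(previous)
```

Restoring the directory in `finally` matters here because `init` resolves the project path from `Path.cwd()`, so a leaked working directory could affect later tests.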
