Skip to content

Commit b36ef98

Browse files
authored
vcspull add - Refactoring and fixes for duplicate workspace roots (#480)
why: - align CLI defaults with duplicate-aware config handling so repositories are never dropped during add/discover/fmt workflows. what: - route load_configs and CLI entry points through the duplicate-aware loader - add a path-first vcspull add flow with confirmation prompts and --yes override - refresh README/CLI docs and drop the obsolete cli/import page
2 parents 9de7877 + cf8813f commit b36ef98

File tree

19 files changed

+1338
-235
lines changed

19 files changed

+1338
-235
lines changed

CHANGES

Lines changed: 31 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -37,17 +37,42 @@ _Upcoming changes will be written here._
3737

3838
#### `vcspull fmt` gains duplicate root merging (#479)
3939

40-
- Detects repeated workspace root labels and merges their repositories during
41-
formatting so users keep a single normalized section.
42-
- Adds `--no-merge` for workflows that need to preserve duplicate entries while
43-
still seeing diagnostic warnings.
40+
- Running `vcspull fmt` now consolidates repeated workspace sections into one
41+
merged block so no repositories vanish during cleanup; pass `--no-merge` if
42+
you want the command to surface duplicates without rewriting them.
43+
44+
#### `vcspull add` protects workspace configs during imports (#480)
45+
46+
- Default behavior now merges duplicate workspace roots so prior entries stay
47+
intact; add `--no-merge` to keep the raw sections and handle them yourself.
48+
- You can invoke `vcspull add ~/study/python/project`; the command inspects the
49+
path, auto-fills the `origin` remote, shows the tilde-shortened workspace, and
50+
asks for confirmation unless you supply `--yes`.
51+
- CLI previews now contract `$HOME` to `~/…`, matching the rest of the UX.
52+
53+
#### `vcspull discover` honors --no-merge (#480)
54+
55+
- Running `vcspull discover --no-merge` only reports duplicates—it leaves the
56+
file untouched until you decide to act.
57+
58+
#### Configuration loader: Support for duplicate workspace roots (#480)
59+
60+
- Commands backed by `load_configs` (`list`, `status`, `sync`, etc.)
61+
automatically keep every repository even when workspace sections repeat; pass
62+
`merge_duplicates=False` to fall back to the legacy "last entry wins"
63+
behavior with a warning.
4464

4565
### Development
4666

4767
#### Snapshot coverage for formatter tests (#479)
4868

49-
- Formatter scenarios now use [Syrupy]-backed JSON and YAML snapshots to guard
50-
against regressions in duplicate workspace-root merging.
69+
- Formatter scenarios are checked against [Syrupy] JSON/YAML snapshots, making it
70+
obvious when future changes alter the merged output or log text.
71+
72+
#### CLI `add` path-mode regression tests (#480)
73+
74+
- Parameterized pytest scenarios cover interactive prompts, duplicate merging,
75+
and path inference to keep the redesigned workflow stable.
5176

5277
[syrupy]: https://github.com/syrupy-project/syrupy
5378

CLAUDE.md

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -103,7 +103,6 @@ Follow this workflow for code changes:
103103
- `cli/__init__.py`: Main CLI entry point with argument parsing
104104
- `cli/sync.py`: Repository synchronization functionality
105105
- `cli/add.py`: Adding new repositories to configuration
106-
- `cli/add_from_fs.py`: Scanning filesystem for repositories
107106

108107
3. **Repository Management**
109108
- Uses `libvcs` package for VCS operations (git, svn, hg)
@@ -247,4 +246,4 @@ When stuck in debugging loops:
247246
1. **Pause and acknowledge the loop**
248247
2. **Minimize to MVP**: Remove all debugging cruft and experimental code
249248
3. **Document the issue** comprehensively for a fresh approach
250-
4. Format for portability (using quadruple backticks)
249+
4. Format for portability (using quadruple backticks)

README.md

Lines changed: 11 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -104,6 +104,10 @@ $ vcspull add my-lib https://github.com/example/my-lib.git --path ~/code/my-lib
104104
`-w ~/projects/libs`.
105105
- Pass `-f/--file` to add to an alternate YAML file.
106106
- Use `--dry-run` to preview changes before writing.
107+
- Point at an existing checkout (`vcspull add ~/projects/example`) to infer the
108+
name and remote; add `--yes` to skip the confirmation prompt.
109+
- Append `--no-merge` if you prefer to review duplicate workspace roots
110+
yourself instead of having vcspull merge them automatically.
107111
- Follow with `vcspull sync my-lib` to clone or update the working tree after registration.
108112

109113
### Discover local checkouts and add en masse
@@ -116,8 +120,9 @@ $ vcspull discover ~/code --recursive
116120
```
117121

118122
The scan shows each repository before import unless you opt into `--yes`. Add
119-
`-w ~/code/` to pin the resulting workspace root or `-f` to
120-
write somewhere other than the default `~/.vcspull.yaml`.
123+
`-w ~/code/` to pin the resulting workspace root or `-f` to write somewhere other
124+
than the default `~/.vcspull.yaml`. Duplicate workspace roots are merged by
125+
default; include `--no-merge` to keep them separate while you review the log.
121126

122127
### Inspect configured repositories
123128

@@ -148,15 +153,16 @@ command for automation workflows.
148153

149154
### Normalize configuration files
150155

151-
After importing or editing by hand, run the formatter to tidy up keys and keep
152-
entries sorted:
156+
After importing or editing by hand, run the formatter to tidy up keys, merge
157+
duplicate workspace sections, and keep entries sorted:
153158

154159
```console
155160
$ vcspull fmt -f ~/.vcspull.yaml --write
156161
```
157162

158163
Use `vcspull fmt --all --write` to format every YAML file that vcspull can
159-
discover under the standard config locations.
164+
discover under the standard config locations. Add `--no-merge` if you only want
165+
duplicate roots reported, not rewritten.
160166

161167
## Sync your repos
162168

docs/cli/add.md

Lines changed: 9 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,11 @@ The `--path` flag is useful when:
6363
- Using non-standard directory layouts
6464
- The repository name doesn't match the desired directory name
6565

66+
You can also point `vcspull add` at an existing checkout. Supplying a path such
67+
as `vcspull add ~/projects/example` infers the repository name, inspects its
68+
`origin` remote, and prompts before writing. Add `--yes` when you need to skip
69+
the confirmation in scripts.
70+
6671
## Choosing configuration files
6772

6873
By default, vcspull looks for the first YAML configuration file in:
@@ -142,14 +147,10 @@ $ vcspull add tooling https://github.com/company/tooling.git \
142147

143148
## Handling duplicates
144149

145-
If a repository with the same name already exists in the workspace, vcspull will warn you:
146-
147-
```console
148-
$ vcspull add flask https://github.com/pallets/flask.git -w ~/code/
149-
WARNING: Repository 'flask' already exists in workspace '~/code/'.
150-
```
151-
152-
The existing entry is preserved and not overwritten.
150+
vcspull merges duplicate workspace sections by default so existing repositories
151+
stay intact. When conflicts appear, the command logs what it kept. Prefer to
152+
resolve duplicates yourself? Pass `--no-merge` to leave every section untouched
153+
while still surfacing warnings.
153154

154155
## After adding repositories
155156

docs/cli/discover.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,10 @@ Scan complete: 2 repositories added, 0 skipped
4545
```
4646

4747
The command prompts for each repository before adding it to your configuration.
48+
When a matching workspace section already exists, vcspull merges the new entry
49+
into it so previously tracked repositories stay intact. Prefer to review
50+
duplicates yourself? Add `--no-merge` to keep every section untouched while
51+
still seeing a warning.
4852

4953
## Recursive scanning
5054

docs/cli/fmt.md

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,11 @@
66
entries stay consistent. By default the formatter prints the proposed changes to
77
stdout. Apply the updates in place with `--write`.
88

9+
When duplicate workspace roots are encountered, the formatter merges them into a
10+
single section so repositories are never dropped. Prefer to review duplicates
11+
without rewriting them? Pass `--no-merge` to leave the original sections in
12+
place while still showing a warning.
13+
914
## Command
1015

1116
```{eval-rst}
@@ -19,12 +24,14 @@ stdout. Apply the updates in place with `--write`.
1924

2025
## What gets formatted
2126

22-
The formatter performs three main tasks:
27+
The formatter performs four main tasks:
2328

2429
- Expands string-only entries into verbose dictionaries using the `repo` key.
2530
- Converts legacy `url` keys to `repo` for consistency with the rest of the
2631
tooling.
2732
- Sorts directory keys and repository names alphabetically to minimize diffs.
33+
- Consolidates duplicate workspace roots into a single merged section while
34+
logging any conflicts.
2835

2936
For example:
3037

src/vcspull/_internal/config_reader.py

Lines changed: 132 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
from __future__ import annotations
22

3+
import copy
34
import json
45
import pathlib
56
import typing as t
@@ -106,6 +107,20 @@ def _from_file(cls, path: pathlib.Path) -> dict[str, t.Any]:
106107
assert isinstance(path, pathlib.Path)
107108
content = path.open(encoding="utf-8").read()
108109

110+
# TODO(#?): Align this loader with the duplicate-aware YAML handling that
111+
# ``vcspull fmt`` introduced in November 2025. The formatter now uses a
112+
# custom SafeLoader subclass to retain and merge duplicate workspace root
113+
# sections so repos are never overwritten. ConfigReader currently drops
114+
# later duplicates because PyYAML keeps only the last key. Options:
115+
# 1) Extract the formatter's loader/merge helpers into a shared utility
116+
# that ConfigReader can reuse here;
117+
# 2) Replace ConfigReader entirely when reading vcspull configs and call
118+
# the formatter helpers directly;
119+
# 3) Keep this basic loader but add an opt-in path for duplicate-aware
120+
# parsing so commands like ``vcspull add`` can avoid data loss.
121+
# Revisit once the new ``vcspull add`` flow lands so both commands share
122+
# the same duplication safeguards.
123+
109124
if path.suffix in {".yaml", ".yml"}:
110125
fmt: FormatLiteral = "yaml"
111126
elif path.suffix == ".json":
@@ -204,3 +219,120 @@ def dump(self, fmt: FormatLiteral, indent: int = 2, **kwargs: t.Any) -> str:
204219
indent=indent,
205220
**kwargs,
206221
)
222+
223+
224+
class _DuplicateTrackingSafeLoader(yaml.SafeLoader):
225+
"""SafeLoader that records duplicate top-level keys."""
226+
227+
def __init__(self, stream: str) -> None:
228+
super().__init__(stream)
229+
self.top_level_key_values: dict[t.Any, list[t.Any]] = {}
230+
self._mapping_depth = 0
231+
232+
233+
def _duplicate_tracking_construct_mapping(
234+
loader: _DuplicateTrackingSafeLoader,
235+
node: yaml.nodes.MappingNode,
236+
deep: bool = False,
237+
) -> dict[t.Any, t.Any]:
238+
loader._mapping_depth += 1
239+
loader.flatten_mapping(node)
240+
mapping: dict[t.Any, t.Any] = {}
241+
242+
for key_node, value_node in node.value:
243+
construct = t.cast(
244+
t.Callable[[yaml.nodes.Node], t.Any],
245+
loader.construct_object,
246+
)
247+
key = construct(key_node)
248+
value = construct(value_node)
249+
250+
if loader._mapping_depth == 1:
251+
loader.top_level_key_values.setdefault(key, []).append(copy.deepcopy(value))
252+
253+
mapping[key] = value
254+
255+
loader._mapping_depth -= 1
256+
return mapping
257+
258+
259+
_DuplicateTrackingSafeLoader.add_constructor(
260+
yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
261+
_duplicate_tracking_construct_mapping,
262+
)
263+
264+
265+
class DuplicateAwareConfigReader(ConfigReader):
266+
"""ConfigReader that tracks duplicate top-level YAML sections."""
267+
268+
def __init__(
269+
self,
270+
content: RawConfigData,
271+
*,
272+
duplicate_sections: dict[str, list[t.Any]] | None = None,
273+
) -> None:
274+
super().__init__(content)
275+
self._duplicate_sections = duplicate_sections or {}
276+
277+
@property
278+
def duplicate_sections(self) -> dict[str, list[t.Any]]:
279+
"""Mapping of top-level keys to the list of duplicated values."""
280+
return self._duplicate_sections
281+
282+
@classmethod
283+
def _load_yaml_with_duplicates(
284+
cls,
285+
content: str,
286+
) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]:
287+
loader = _DuplicateTrackingSafeLoader(content)
288+
289+
try:
290+
data = loader.get_single_data()
291+
finally:
292+
dispose = t.cast(t.Callable[[], None], loader.dispose)
293+
dispose()
294+
295+
if data is None:
296+
loaded: dict[str, t.Any] = {}
297+
else:
298+
if not isinstance(data, dict):
299+
msg = "Loaded configuration is not a mapping"
300+
raise TypeError(msg)
301+
loaded = t.cast("dict[str, t.Any]", data)
302+
303+
duplicate_sections = {
304+
t.cast(str, key): values
305+
for key, values in loader.top_level_key_values.items()
306+
if len(values) > 1
307+
}
308+
309+
return loaded, duplicate_sections
310+
311+
@classmethod
312+
def _load_from_path(
313+
cls,
314+
path: pathlib.Path,
315+
) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]:
316+
if path.suffix.lower() in {".yaml", ".yml"}:
317+
content = path.read_text(encoding="utf-8")
318+
return cls._load_yaml_with_duplicates(content)
319+
320+
return ConfigReader._from_file(path), {}
321+
322+
@classmethod
323+
def from_file(cls, path: pathlib.Path) -> DuplicateAwareConfigReader:
324+
content, duplicate_sections = cls._load_from_path(path)
325+
return cls(content, duplicate_sections=duplicate_sections)
326+
327+
@classmethod
328+
def _from_file(cls, path: pathlib.Path) -> dict[str, t.Any]:
329+
content, _ = cls._load_from_path(path)
330+
return content
331+
332+
@classmethod
333+
def load_with_duplicates(
334+
cls,
335+
path: pathlib.Path,
336+
) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]:
337+
reader = cls.from_file(path)
338+
return reader.content, reader.duplicate_sections

src/vcspull/cli/__init__.py

Lines changed: 3 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@
1515
from vcspull.log import setup_logger
1616

1717
from ._formatter import VcspullHelpFormatter
18-
from .add import add_repo, create_add_subparser
18+
from .add import add_repo, create_add_subparser, handle_add_command
1919
from .discover import create_discover_subparser, discover_repos
2020
from .fmt import create_fmt_subparser, format_config_file
2121
from .list import create_list_subparser, list_repos
@@ -368,14 +368,7 @@ def cli(_args: list[str] | None = None) -> None:
368368
max_concurrent=getattr(args, "max_concurrent", None),
369369
)
370370
elif args.subparser_name == "add":
371-
add_repo(
372-
name=args.name,
373-
url=args.url,
374-
config_file_path_str=args.config,
375-
path=args.path,
376-
workspace_root_path=args.workspace_root_path,
377-
dry_run=args.dry_run,
378-
)
371+
handle_add_command(args)
379372
elif args.subparser_name == "discover":
380373
discover_repos(
381374
scan_dir_str=args.scan_dir,
@@ -384,6 +377,7 @@ def cli(_args: list[str] | None = None) -> None:
384377
workspace_root_override=args.workspace_root_path,
385378
yes=args.yes,
386379
dry_run=args.dry_run,
380+
merge_duplicates=args.merge_duplicates,
387381
)
388382
elif args.subparser_name == "fmt":
389383
format_config_file(

0 commit comments

Comments
 (0)