
Conversation


@VITYANA VITYANA commented Nov 3, 2025

No description provided.

@VITYANA VITYANA requested a review from iraedeus November 3, 2025 18:32
@VITYANA VITYANA self-assigned this Nov 3, 2025
@VITYANA VITYANA added the enhancement New feature or request label Nov 3, 2025
@VITYANA VITYANA changed the title from "Refactor: initializers refactor" to "refactor: initializers module" Nov 3, 2025

In the examples we import from modules, not from files inside a module. That's why there are __init__.py files:

from rework_pysatl_mpest.optimizers import Optimizer, ScipyNelderMead
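For reference, a minimal sketch of the re-exports that make this module-level import work; the submodule file names here are assumptions, only the pattern matters:

# rework_pysatl_mpest/optimizers/__init__.py (hypothetical file layout)
from .abstract_optimizer import Optimizer
from .scipy_nelder_mead import ScipyNelderMead

__all__ = ["Optimizer", "ScipyNelderMead"]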


def _validate_clusters_distributions(

Missing docstrings.

return valid_clusters, cluster_weights


def _calculate_cluster_fit(

Missing docstrings.

H_k: np.ndarray,
optimizer: Optimizer,
) -> tuple[dict[str, float], float]:
new_params = estimation_func(temp_model, X, H_k, optimizer)

Suggested change
- new_params = estimation_func(temp_model, X, H_k, optimizer)
+ param_names, param_values = zip(*estimation_func(temp_model, X, H_k, optimizer).items())
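(A bare .items() unpacks to (key, value) pairs, so zip(*...) is needed to split the names from the values before the later set_params_from_vector(param_names, param_values) call.)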

X = X.reshape(-1, 1)
self.models = dists
self.n_components = len(dists)
H = self._clusterize(X, self.clusterizer)

The clusterizer is already available from the object's internal state, so why is it also passed in the signature of self._clusterize?
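A minimal sketch of that cleanup, assuming a sklearn-style fit_predict method on the clusterizer:

def _clusterize(self, X: np.ndarray) -> np.ndarray:
    # Read the clusterizer from the instance instead of the argument list.
    return self.clusterizer.fit_predict(X)  # fit_predict is an assumption

# The call site then becomes:
H = self._clusterize(X)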

temp_model.set_params_from_vector(param_names, param_values)

log_probs = np.clip(temp_model.lpdf(X), -1e9, -1e-9)
weighted_log_likelihood = np.sum(H_k * log_probs)

Suggested change
- weighted_log_likelihood = np.sum(H_k * log_probs)
+ weighted_log_likelihood = np.dot(H_k, log_probs)
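For 1-D arrays the two are equivalent; np.dot just computes the same weighted sum without materializing the intermediate H_k * log_probs array.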

5. Returns the initialized mixture model
"""
X = np.asarray(X, dtype=np.float64)
if X.ndim == 1:

Move this inside self._clusterize(...).
Otherwise the two-dimensional array X will be passed to the rest of the class's internal methods, which do not seem suited for it.
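A sketch of the suggested move, again assuming a fit_predict-style clusterizer; the reshape stays local to clustering while every other method keeps receiving the original 1-D sample:

def _clusterize(self, X: np.ndarray) -> np.ndarray:
    # Clustering needs (n_samples, n_features); reshape only here.
    X_2d = X.reshape(-1, 1) if X.ndim == 1 else X
    return self.clusterizer.fit_predict(X_2d)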

initializer = ClusterizeInitializer(is_accurate=True, is_soft=True, clusterizer=self.mock_clusterizer)

X = np.array([1.0, 2.0, 3.0])
X = np.array([[1.0], [2.0], [3.0]])

What is this for? This test is not intended to check various X formats.

Besides, why this particular test? Tests should be written in the format the user will actually work with, i.e. assuming a one-dimensional array is passed.
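Something along these lines instead, exercising only the documented 1-D path (the fixture names are stand-ins):

X = np.array([1.0, 2.0, 3.0])
result = initializer.perform(X, dists)  # dists: the test's list of distributions
assert isinstance(result, MixtureModel)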

)

effective_n = cluster_weights[k]
score = weighted_log_likelihood / effective_n

Why is there an additional division here?


Refactor the constructor and the perform signature:
The user should pass the default optimizer and clusterizer to the constructor, and pass them to perform only when they want to override them.
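One possible shape of that refactor; the parameter names and fallback logic are illustrative, not prescriptive:

class ClusterizeInitializer(Initializer):
    def __init__(self, clusterizer: Any, optimizer: Optimizer = ScipyNelderMead()):
        # Defaults live on the instance ...
        self.clusterizer = clusterizer
        self.optimizer = optimizer

    def perform(self, X, dists, clusterizer=None, optimizer=None):
        # ... and perform falls back to them unless overrides are passed.
        clusterizer = clusterizer if clusterizer is not None else self.clusterizer
        optimizer = optimizer if optimizer is not None else self.optimizer
        ...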


Since the clusterizer's declared type is Any, it is necessary to check, both in the constructor and in perform, that it actually provides the required methods.
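A sketch of such a duck-typing guard; the required method name is an assumption and should match the real clusterizer protocol:

def _check_clusterizer(clusterizer: Any) -> None:
    for name in ("fit_predict",):  # assumed protocol
        if not callable(getattr(clusterizer, name, None)):
            raise TypeError(f"clusterizer must implement {name}()")

Calling it from both __init__ and perform keeps the failure early and explicit.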

…eparated into

greedy, hungarian, permutation methods, also for scoring functions aic and likelihood
feat: Enums for matching methods, scoring methods
tests: tests that include cluster_match_strategy commented/deleted
X: ArrayLike,
dists: list[ContinuousDistribution],
cluster_match_info: ClusterMatchStrategy,
cluster_match_info: MatchingMethod,

Forgot to update the type hint in the docs.

for model, est_func in zip(models, estimation_strategies):
row: list[FitResult] = []
for k in valid_clusters:
cache_key = (model.__class__, est_func, k)

What happens if models contains multiple instances of the same class? It seems they would share the same cache key.
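One way to avoid the collision is to key on instance identity instead of the class, e.g.:

# id(model) distinguishes two instances of the same class;
# an index from enumerate(models) would work just as well.
cache_key = (id(model), est_func, k)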

score_func: ScoringMethod,
estimation_strategies: list[EstimationStrategy],
optimizer: Optimizer = ScipyNelderMead(),
) -> MixtureModel:

The argument names here do not match the abstract base class definition in Initializer.

model_weights: list[float] = []
used_clusters = set()

for model, estimation_func in zip(context["models"], context["estimation_strategies"]):

The greedy strategy currently iterates sequentially. Was this order-dependency intended?
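If it is not, a globally ranked assignment would remove the order dependence. A sketch, where scores[i][k] stands for the fit score of model i on cluster k:

# Rank every (model, cluster) pair by score and assign best-first.
ranked = sorted(
    ((scores[i][k], i, k) for i in range(len(models)) for k in valid_clusters),
    key=lambda t: t[0],
    reverse=True,
)
assigned: dict[int, int] = {}
for score, i, k in ranked:
    if i not in assigned and k not in used_clusters:
        assigned[i] = k
        used_clusters.add(k)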

