Families/normal family configuration #58

MyznikovFD · 2025-12-01T08:07:04Z

Implement Normal distribution with meanVar, meanPrec, exponential parameterizations
Add analytical implementations for PDF, CDF, PPF, characteristic function
Implement moment calculations (mean, variance, skewness, kurtosis)
Configure family registry and parametrization system"
Add 16 tests covering all distribution characteristics
Test multiple parameterizations and conversions
Verify against scipy.stats.norm for correctness"

- Implement Normal distribution with meanVar, meanPrec, exponential parameterizations - Add analytical implementations for PDF, CDF, PPF, characteristic function - Implement moment calculations (mean, variance, skewness, kurtosis) - Configure family registry and parametrization system

- Add 16 tests covering all distribution characteristics - Test multiple parameterizations and conversions - Verify against scipy.stats.norm for correctness

LeonidElkin

In general, it's very cool, but there are many minor flaws.

LeonidElkin · 2025-12-01T11:03:29Z

src/pysatl_core/families/config.py

+import numpy.typing as npt
+from scipy.special import erf, erfinv
+
+from pysatl_core.distributions import DefaultSamplingUnivariateStrategy


let's use full paths inside our library. Prefer pysatl_core.distributions.strategies instead of pysatl_core.distributions. That's because of the __init__.py inside distributions module

LeonidElkin · 2025-12-01T11:26:55Z

src/pysatl_core/families/config.py

+def configure_family_register() -> None:
+    """
+    Configure and register all distribution families in the global registry.
+
+    This function initializes all parametric families with their respective
+    parameterizations, characteristics, and sampling strategies. It should be
+    called during application startup to make distributions available.
+    """
+    _configure_normal_family()


It is required @lru_cache(maxsize=1) decorator and returning ParametricFamilyRegister as well. Like in distributions/registry.py

@lru_cache(maxsize=1) def distribution_type_register() -> DistributionTypeRegister: reg = DistributionTypeRegister() _configure(reg) return reg

but you can do it like

@lru_cache(maxsize=1) def configure_family_register() -> ParametricFamilyRegister: _configure_normal_family() return ParametricFamilyRegister()

I do like the style of using ParametricFamilyRegister.register(Smth). It seems more natural in term of singletone object so you have no need in coding reg = DistributionTypeRegister() and pass it through the configuration function. I think I'll do the same thing in the new graph implementation. But we really need caching and returning the register object in such a function

LeonidElkin · 2025-12-01T11:37:18Z

src/pysatl_core/families/config.py

+@dataclass
+class MeanVarParametrization(Parametrization):
+    """
+    Mean-variance parametrization of normal distribution.
+
+    Parameters
+    ----------
+    mu : float
+        Mean of the distribution
+    sigma : float
+        Standard deviation of the distribution
+    """
+
+    mu: float
+    sigma: float
+
+    @constraint(description="sigma > 0")
+    def check_sigma_positive(self) -> bool:
+        """Check that standard deviation is positive."""
+        return self.sigma > 0
+
+
+@dataclass
+class MeanPrecParametrization(Parametrization):
+    """
+    Mean-precision parametrization of normal distribution.
+
+    Parameters
+    ----------
+    mu : float
+        Mean of the distribution
+    tau : float
+        Precision parameter (inverse variance)
+    """
+
+    mu: float
+    tau: float
+
+    @constraint(description="tau > 0")
+    def check_tau_positive(self) -> bool:
+        """Check that precision parameter is positive."""
+        return self.tau > 0
+
+    def transform_to_base_parametrization(self) -> Parametrization:
+        """
+        Transform to mean-variance parametrization.
+
+        Returns
+        -------
+        Parametrization
+            Mean-variance parametrization instance
+        """
+        sigma = math.sqrt(1 / self.tau)
+        return MeanVarParametrization(mu=self.mu, sigma=sigma)
+
+
+@dataclass
+class ExpParametrization(Parametrization):
+    """
+    Exponential family parametrization of normal distribution.
+        Uses the form: y = exp(a*x² + b*x + c)
+
+    Parameters
+    ----------
+    a : float
+        Quadratic term coefficient in exponential form
+    b : float
+        Linear term coefficient in exponential form
+    """
+
+    a: float
+    b: float
+
+    @property
+    def calculate_c(self) -> float:
+        """
+        Calculate the normalization constant c.
+
+        Returns
+        -------
+        float
+            Normalization constant
+        """
+        return (self.b**2) / (4 * self.a) - (1 / 2) * math.log(math.pi / (-self.a))
+
+    @constraint(description="a < 0")
+    def check_a_negative(self) -> bool:
+        """Check that quadratic term coefficient is negative."""
+        return self.a < 0
+
+    def transform_to_base_parametrization(self) -> Parametrization:
+        """
+        Transform to mean-variance parametrization.
+        Returns
+        -------
+        Parametrization
+            Mean-variance parametrization instance
+        """
+        mu = -self.b / (2 * self.a)
+        sigma = math.sqrt(-1 / (2 * self.a))
+        return MeanVarParametrization(mu=mu, sigma=sigma)


Why don't you use the @parametrization decorator? You can create these classes after creating Normal object and use @Normal.parametrization syntax. It's more smooth and natural in term of families module. You, by the way, don't have to use @dataclass deco. When you're using @parametrization deco class is beeing transformed into a dataclass.

It was decided to leave this issue unchanged in private conversation

LeonidElkin · 2025-12-01T11:41:52Z

src/pysatl_core/families/config.py

+    parametrization(family=Normal, name="meanVar")(MeanVarParametrization)
+    parametrization(family=Normal, name="meanPrec")(MeanPrecParametrization)
+    parametrization(family=Normal, name="exponential")(ExpParametrization)


As already been said, there is no need in it if you're using @parametrization deco

It was decided to leave this issue unchanged in private conversation

LeonidElkin · 2025-12-01T11:44:57Z

src/pysatl_core/families/config.py

+    def mean_func(parameters: Parametrization, __: Any = None) -> float:
+        """Mean of normal distribution."""
+        parameters = cast(MeanVarParametrization, parameters)
+        return parameters.mu
+
+    def var_func(parameters: Parametrization, __: Any = None) -> float:
+        """Variance of normal distribution."""
+        parameters = cast(MeanVarParametrization, parameters)
+        return parameters.sigma**2
+
+    def skew_func(_: Parametrization, __: Any = None) -> int:
+        """Skewness of normal distribution (always 0)."""
+        return 0
+
+    def raw_kurt_func(_: Parametrization, __: Any = None) -> int:
+        """Raw kurtosis of normal distribution (always 3)."""
+        return 3
+
+    def ex_kurt_func(_: Parametrization, __: Any) -> int:
+        """Excess kurtosis of normal distribution (always 0)."""
+        return 0


The fact that some arguments are not used is definitely a design problem, but let's use the following naming convention in such cases. If there is one unused argument, we name it _, if there are several, we call them _1, _2 and so on.

LeonidElkin · 2025-12-01T12:39:32Z

tests/unit/families/test_configuration.py

+        # Test meanPrec conversion
+        dist_mp = self.normal_family(mu=2.0, tau=0.25, parametrization_name="meanPrec")
+        base_from_mp = self.normal_family.to_base(dist_mp.parameters)
+        base_from_mp = cast(MeanVarParametrization, base_from_mp)
+        assert abs(base_from_mp.mu - 2.0) < 1e-10
+        assert abs(base_from_mp.sigma - 2.0) < 1e-10  # sigma = 1/sqrt(tau) = 1/sqrt(0.25) = 2
+
+        # Test exponential conversion
+        dist_exp = self.normal_family(a=-0.222, b=0.888, parametrization_name="exponential")
+        base_from_exp = self.normal_family.to_base(dist_exp.parameters)
+        base_from_exp = cast(MeanVarParametrization, base_from_exp)
+        # Should be approximately N(2, 1.5)
+        assert abs(base_from_exp.mu - 2.0) < 0.1
+        assert abs(base_from_exp.sigma - 1.5) < 0.1


Actually can move not the base parametrizations into a separated test and use parametric tests as well

LeonidElkin · 2025-12-01T12:43:27Z

tests/unit/families/test_configuration.py

+    def test_analytical_computations_caching(self):
+        """Test that analytical computations are properly cached."""
+        dist = self.normal_family(mu=0.0, sigma=1.0)
+
+        # Access analytical computations multiple times
+        comp1 = dist.analytical_computations
+        comp2 = dist.analytical_computations
+
+        # Should be the same object (cached)
+        assert comp1 is comp2
+
+        # Should contain expected characteristics
+        expected_chars = {
+            "pdf",
+            "cdf",
+            "ppf",
+            "char_func",
+            "mean",
+            "var",
+            "skewness",
+            "raw_kurtosis",
+            "excess_kurtosis",
+        }
+        assert set(comp1.keys()) == expected_chars


Caching is already tested in test_distribution_cache. Leave only this part

def test_analytical_computations_caching(self): """Test that analytical computations are properly cached.""" comp = self.normal_family(mu=0.0, sigma=1.0).analytical_computations # Should contain expected characteristics expected_chars = { "pdf", "cdf", "ppf", "char_func", "mean", "var", "skewness", "raw_kurtosis", "excess_kurtosis", } assert set(comp.keys()) == expected_chars

LeonidElkin · 2025-12-01T12:45:21Z

tests/unit/families/test_configuration.py

+    def test_array_input_support(self):
+        """Test that characteristics support array inputs."""
+        dist = self.normal_family(mu=0.0, sigma=1.0)
+
+        # Test with numpy array input
+        x_array = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
+
+        pdf = dist.computation_strategy.query_method("pdf", dist)
+        cdf = dist.computation_strategy.query_method("cdf", dist)
+
+        pdf_array = pdf(x_array)
+        cdf_array = cdf(x_array)
+
+        # Results should be arrays of same shape
+        assert pdf_array.shape == x_array.shape
+        assert cdf_array.shape == x_array.shape
+
+        # Compare with scipy
+        scipy_pdf = norm.pdf(x_array)
+        scipy_cdf = norm.cdf(x_array)
+
+        np.testing.assert_array_almost_equal(pdf_array, scipy_pdf, decimal=10)
+        np.testing.assert_array_almost_equal(cdf_array, scipy_cdf, decimal=10)


Don't do several characteristics. Either leave only one either do parametric tests where you test every single characteristic

LeonidElkin · 2025-12-01T12:46:19Z

tests/unit/families/test_configuration.py

+
+if __name__ == "__main__":
+    pytest.main([__file__, "-v"])


Why? Remove it

LeonidElkin · 2025-12-01T12:48:35Z

tests/unit/families/test_configuration.py

Add to setup something like ``self.normal_dist_example = self.normal_family(mu=2.0, sigma=1.5) ` to use it everywhere where parametrization and its parameters don't matter

…ce and readability increase

LeonidElkin

Minor changes are needed

LeonidElkin · 2025-12-01T22:03:06Z

src/pysatl_core/families/configuration.py

 from dataclasses import dataclass
-from typing import Any, cast
+from functools import lru_cache
+from typing import TYPE_CHECKING, Any, cast


Remove import Any from here. You already importing it while type checking

It is my fault, though. I gave you the wrong code sample. Double check me ;)

LeonidElkin · 2025-12-01T22:06:15Z

src/pysatl_core/families/registry.py

+    try:
+        from pysatl_core.families.configuration import configure_family_register
+
+        configure_family_register.cache_clear()
+    except ImportError:
+        pass


Why are you using try-except here? Why don't import it in the top of the file?

LeonidElkin · 2025-12-01T22:08:06Z

tests/unit/families/test_configuration.py

+    @pytest.mark.parametrize(
+        "t",
+        [
+            (-2.0),
+            (-1.0),
+            (0.0),
+            (1.0),
+            (2.0),
+        ],
+    )
+    def test_characteristic_function(self, t):


Try not to use one letter names. No clue what does t mean

And remove redundant parentheses

LeonidElkin · 2025-12-01T22:09:26Z

tests/unit/families/test_configuration.py

+    @pytest.mark.parametrize(
+        "char_func_getter, expected",
+        [
+            (lambda d: d.computation_strategy.query_method("mean", d)(None), 2.0),
+            (lambda d: d.computation_strategy.query_method("var", d)(None), 2.25),
+            (lambda d: d.computation_strategy.query_method("skewness", d)(None), 0.0),
+            (lambda d: d.computation_strategy.query_method("excess_kurtosis", d)(None), 0.0),
+        ],
+    )


Same as the comment above. But I do know what the d means ;) Allthough, use name like dist or distribution

And don't use None arg. You already have it as the default

LeonidElkin · 2025-12-01T22:46:33Z

tests/unit/families/test_configuration.py

+    @pytest.mark.parametrize(
+        "parametrization_name, params",
+        [
+            ("meanVar", {"mu": 2.0, "sigma": 1.5}),
+            ("meanPrec", {"mu": 2.0, "tau": 0.25}),
+            ("exponential", {"a": -0.222222, "b": 0.888889}),
+        ],
+    )
+    def test_parametrization_conversions(self, parametrization_name, params):
+        """Test conversions between different parameterizations."""
+        dist = self.normal_family(parametrization_name=parametrization_name, **params)
+        base_params = cast(
+            NormalMeanVarParametrization, self.normal_family.to_base(dist.parameters)
+        )
+        tolerance = self.CALCULATION_PRECISION

-    def test_moments(self):
-        """Test moment calculations."""
-        dist = self.normal_family(mu=2.0, sigma=1.5)
+        if parametrization_name == "meanPrec":
+            expected_sigma = math.sqrt(1 / params["tau"])

-        mean_func = dist.computation_strategy.query_method("mean", dist)
-        var_func = dist.computation_strategy.query_method("var", dist)
-        skew_func = dist.computation_strategy.query_method("skewness", dist)
-        kurt_func = dist.computation_strategy.query_method("excess_kurtosis", dist)
+        elif parametrization_name == "exponential":
+            expected_mu = -params["b"] / (2 * params["a"])
+            expected_sigma = math.sqrt(-1 / (2 * params["a"]))



If you need to if-else by parameters in parametric tests than you're doing smth wrong. Move expected mu, sigma to parameters. Also don't create useless dist object and do smth like this

self.normal_family.to_base( self.normal_family.get_parametrization(parametrization_name)(**params) )

And use self.CALCULATION_PRECISION don't make alias via tolerance variable

TheodorDM added 2 commits December 1, 2025 11:04

test(families): add test suite for normal distribution

62f39f1

- Add 16 tests covering all distribution characteristics - Test multiple parameterizations and conversions - Verify against scipy.stats.norm for correctness

MyznikovFD requested review from Desiment and LeonidElkin December 1, 2025 08:07

LeonidElkin requested changes Dec 1, 2025

View reviewed changes

refactor: improve normal family configuration with standards complian…

13d09c4

…ce and readability increase

MyznikovFD force-pushed the families/normal-family-configuration branch from 7ca3ea7 to 13d09c4 Compare December 1, 2025 21:05

MyznikovFD requested a review from LeonidElkin December 1, 2025 21:11

LeonidElkin requested changes Dec 1, 2025

View reviewed changes

Families/normal family configuration #58

Are you sure you want to change the base?

Families/normal family configuration #58

Conversation

MyznikovFD commented Dec 1, 2025

Uh oh!

LeonidElkin left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

LeonidElkin Dec 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

LeonidElkin left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

LeonidElkin Dec 1, 2025 •

edited

Loading