Continuous Fit Results
Upon completion of the fitting process, the Phitter instance provides several methods and properties for examining and interpreting the outcome. Each method either returns a specific metric or presents a collection of fitted distributions in tabular form.
Global Results
1. phi.summarize(k: int) -> pandas.DataFrame
Generates a concise summary table of the top-k distributions based on goodness-of-fit evaluations.
- k (int, optional): Defines how many distributions appear in the output. The default is 20.
phi.summarize(k=20)
# Returns a pandas.DataFrame containing columns such as:
# - distribution
# - sse
# - parameters
# - chi_square
# - kolmogorov_smirnov
# - anderson_darling (continuous case only)
| distribution | sse | parameters | chi_square | kolmogorov_smirnov | anderson_darling |
|---|---|---|---|---|---|
| loggamma | 0.0000 | ... | ✅ | ✅ | ✅ |
| johnson_sb | 0.0000 | ... | ✅ | ✅ | ✅ |
| kumaraswamy | 0.0000 | ... | ✖️ | ✅ | ✅ |
| dagum_4p | 0.0000 | ... | ✖️ | ✅ | ✅ |
| beta_prime_4p | 0.0000 | ... | ✖️ | ✖️ | ✖️ |
2. phi.summarize_info(k: int) -> pandas.DataFrame
Generates a more detailed table than phi.summarize(), including the raw (True/False) rejection results from each statistical test.
- k (int, optional): Defines how many distributions appear in the output. The default is 10.
phi.summarize_info(k=10)
# Returns a pandas.DataFrame containing columns such as:
# - distribution
# - sse
# - parameters
# - chi_square
# - kolmogorov_smirnov
# - anderson_darling (continuous case only)
3. phi.best_distribution -> dict
Provides a dictionary describing the top-ranked distribution, judged primarily by the lowest SSE and by the statistical tests it passes.
- Return Value:
{ "id": "<distribution_name>", "parameters": { ... } }
Example:
phi.best_distribution
# -> {
# "id": "weibull",
# "parameters": {
# "alpha": 1.04,
# "beta": 120.5,
# ...
# }
# }
4. phi.sorted_distributions_sse -> dict
Provides a dictionary in which the keys are distribution identifiers and the values are nested dictionaries with metrics such as parameters, SSE, and the results of the statistical tests. The items are sorted primarily by the number of tests passed, then by the sum of squared errors.
phi.sorted_distributions_sse
# -> {
# "weibull": {
# "chi_square": {...},
# "kolmogorov_smirnov": {...},
# "anderson_darling": {...},
# "parameters": {...},
# "sse": float,
# "n_test_passed": int,
# "n_test_null": int
# },
# "gamma": {
# ...
# },
# ...
# }
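The ordering described above (tests passed first, then SSE) can be sketched in plain Python. This is only an illustration of the stated sort rule, not Phitter's internal code: the `results` dictionary, its numbers, and the helper `sort_by_tests_then_sse` are hypothetical, and only the keys `sse` and `n_test_passed` come from the structure shown above.

```python
# Hypothetical results shaped like the values of phi.sorted_distributions_sse.
# All numbers are invented for illustration.
results = {
    "gamma": {"sse": 0.0031, "n_test_passed": 2},
    "weibull": {"sse": 0.0022, "n_test_passed": 3},
    "normal": {"sse": 0.0015, "n_test_passed": 1},
    "lognormal": {"sse": 0.0027, "n_test_passed": 3},
}


def sort_by_tests_then_sse(results):
    """Order by tests passed (descending), then by SSE (ascending)."""
    ordered = sorted(
        results.items(),
        key=lambda item: (-item[1]["n_test_passed"], item[1]["sse"]),
    )
    # Dictionaries preserve insertion order, so the ranking survives.
    return dict(ordered)


ranked = sort_by_tests_then_sse(results)
print(list(ranked))  # ['weibull', 'lognormal', 'gamma', 'normal']
```

Note that "normal" has the lowest SSE in this made-up data but ranks last, because the number of tests passed takes priority.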
5. phi.not_rejected_distributions -> dict
Provides a dictionary structured similarly to phi.sorted_distributions_sse, containing only those distributions that have passed at least one statistical test (i.e., were not fully rejected).
phi.not_rejected_distributions
# -> {
# "weibull": {
# ... same structure ...
# },
# ...
# }
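The "passed at least one test" rule can be illustrated with a small filter over a dictionary of the same shape. This is a sketch under assumptions: the data is invented and `keep_not_rejected` is a hypothetical helper, not a Phitter function.

```python
# Hypothetical entries shaped like phi.sorted_distributions_sse values.
results = {
    "weibull": {"sse": 0.0022, "n_test_passed": 3},
    "gamma": {"sse": 0.0031, "n_test_passed": 1},
    "uniform": {"sse": 0.0410, "n_test_passed": 0},
}


def keep_not_rejected(results):
    """Keep only distributions that at least one test did not reject."""
    return {
        name: info
        for name, info in results.items()
        if info["n_test_passed"] >= 1
    }


not_rejected = keep_not_rejected(results)
print(list(not_rejected))  # ['weibull', 'gamma']
```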
6. phi.df_sorted_distributions_sse -> pandas.DataFrame
Generates a Pandas DataFrame representation of phi.sorted_distributions_sse. All distributions that were successfully fitted are included, organized by metrics and test results in a multi-index column structure.
phi.df_sorted_distributions_sse.head(10)
# -> Shows the top 10 distributions based on sum of squared errors (SSE) and test passes
7. phi.df_not_rejected_distributions -> pandas.DataFrame
Generates a Pandas DataFrame representation of phi.not_rejected_distributions. This table contains only distributions for which at least one test was not rejected.
phi.df_not_rejected_distributions
# -> pandas.DataFrame with columns for distribution name, SSE, parameters, and test statistics
Results for a Specific Distribution
These methods each require a distribution identifier (id_distribution) to inspect the metrics and results of one particular fitted distribution.
1. phi.get_parameters(id_distribution: str) -> dict
Retrieves the fitted parameters of the specified distribution.
phi.get_parameters(id_distribution="weibull")
# -> {
# "alpha": ...,
# "beta": ...,
# ...
# }
2. phi.get_test_chi_square(id_distribution: str) -> dict
Returns the Chi-Square test results as a dictionary containing the test statistic, critical value, p-value, and a boolean indicating whether the distribution is rejected under the given significance level.
phi.get_test_chi_square("weibull")
# -> {
# "test_statistic": ...,
# "critical_value": ...,
# "p_value": ...,
# "rejected": ...
# }
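To show how the `test_statistic`, `critical_value`, and `rejected` fields relate, here is a minimal sketch of a Pearson chi-square decision. It is an assumption-laden illustration rather than Phitter's implementation: the observed and expected counts are invented, the helpers are hypothetical, and the critical value 7.815 is the standard 0.95 quantile for 3 degrees of freedom (Phitter's own binning and degrees of freedom may differ). The p-value is omitted because it needs the chi-square CDF.

```python
def chi_square_statistic(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def decide(test_statistic, critical_value):
    """Reject when the statistic exceeds the critical value."""
    return {
        "test_statistic": test_statistic,
        "critical_value": critical_value,
        "rejected": test_statistic > critical_value,
    }


# Invented counts for four bins; both lists sum to 100.
observed = [18, 22, 29, 31]
expected = [25.0, 25.0, 25.0, 25.0]

# Assumed: 3 degrees of freedom at a 0.05 significance level.
CRITICAL_VALUE_DF3_ALPHA_005 = 7.815

result = decide(
    chi_square_statistic(observed, expected),
    CRITICAL_VALUE_DF3_ALPHA_005,
)
print(result["rejected"])  # False, since the statistic (4.4) is below 7.815
```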
3. phi.get_test_kolmogorov_smirnov(id_distribution: str) -> dict
Returns the Kolmogorov-Smirnov test results as a dictionary with the same structure as in the Chi-Square test.
phi.get_test_kolmogorov_smirnov("weibull")
# -> {
# "test_statistic": ...,
# "critical_value": ...,
# "p_value": ...,
# "rejected": ...
# }
4. phi.get_test_anderson_darling(id_distribution: str) -> dict
Provides the Anderson-Darling test results, containing its test statistic, critical value(s), p-value (if applicable), and rejection status.
(Available for continuous fits only.)
phi.get_test_anderson_darling("weibull")
# -> {
# "test_statistic": ...,
# "critical_value": ...,
# "p_value": ...,
# "rejected": ...
# }
5. phi.get_sse(id_distribution: str) -> float
Retrieves the sum of squared errors (SSE) between the specified distribution's probability function and the empirical frequencies of the dataset.
phi.get_sse("weibull")
# -> 0.00215789
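As a rough illustration of what such an SSE measures, the sketch below compares a histogram density with a candidate density evaluated at the bin centers. The binning scheme, the exponential density, the sample data, and all function names are assumptions made for this example; Phitter's exact procedure may differ.

```python
import math


def exponential_pdf(x, lam):
    """Density of an exponential distribution with rate lam."""
    return lam * math.exp(-lam * x)


def histogram_density(data, n_bins):
    """Return bin centers and normalized densities for equal-width bins."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    densities = [c / (len(data) * width) for c in counts]
    return centers, densities


def sse(data, pdf, n_bins):
    """Sum of squared differences between empirical and fitted densities."""
    centers, densities = histogram_density(data, n_bins)
    return sum((d - pdf(c)) ** 2 for c, d in zip(centers, densities))


# Invented sample; the rate 1.0 is an arbitrary candidate parameter.
data = [0.1, 0.2, 0.3, 0.5, 0.7, 0.9, 1.2, 1.6, 2.1, 3.0]
value = sse(data, lambda x: exponential_pdf(x, 1.0), n_bins=5)
print(value)
```

A smaller value means the candidate density tracks the histogram more closely, which is why SSE serves as a ranking metric.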
6. phi.get_n_test_passed(id_distribution: str) -> int
Indicates how many tests (among Chi-Square, Kolmogorov-Smirnov, and Anderson-Darling) did not reject the specified distribution.
phi.get_n_test_passed("weibull")
# -> 2
7. phi.get_n_test_null(id_distribution: str) -> int
Indicates how many tests were not evaluated or not applicable, leaving their results as None. This situation can arise if certain metrics fail to compute or if the distribution does not allow a specific test.
phi.get_n_test_null("weibull")
# -> 0
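The two counters can be pictured as a tally over the per-test `rejected` flags. The sketch below assumes that an unevaluated test is represented either by a missing entry or by `rejected` being None; that representation, the sample entry, and the helper `count_tests` are assumptions for illustration.

```python
TEST_NAMES = ("chi_square", "kolmogorov_smirnov", "anderson_darling")


def count_tests(entry):
    """Return (passed, null) counts for one distribution's test results."""
    passed = 0
    null = 0
    for name in TEST_NAMES:
        # Treat a missing or None test entry like a None "rejected" flag.
        rejected = (entry.get(name) or {}).get("rejected")
        if rejected is None:
            null += 1
        elif not rejected:
            passed += 1
    return passed, null


# Invented entry: one rejection, one pass, one test that was not evaluated.
entry = {
    "chi_square": {"rejected": True},
    "kolmogorov_smirnov": {"rejected": False},
    "anderson_darling": {"rejected": None},
}

passed, null = count_tests(entry)
print(passed, null)  # 1 1
```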
Usage Example
Fitting a Dataset
import phitter

data = [ ... ]  # Dataset of numeric values
phi = phitter.Phitter(data)
phi.fit(n_workers=2)
Retrieving the Best Distribution
best_dist = phi.best_distribution
# -> {
#   "id": "weibull",
#   "parameters": {
#     ...
#   }
# }
Summarizing Results
summary_table = phi.summarize(k=10)
display(summary_table)
Distribution-Specific Statistics
parameters = phi.get_parameters("weibull")
chi_result = phi.get_test_chi_square("weibull")
kolmo_result = phi.get_test_kolmogorov_smirnov("weibull")
DataFrame of All Fitted Distributions
df_all = phi.df_sorted_distributions_sse
display(df_all.head(10))
By combining these methods and properties, thorough insights into the fitted distributions can be obtained. Both high-level overviews (for example, .best_distribution and .summarize()) and detailed verification of individual distributions (for example, get_test_chi_square(...) and get_parameters(...)) are available to facilitate a rigorous analysis of the results.
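One way to combine the high-level and per-distribution accessors is a small reporting helper. The function `describe_best` below is hypothetical and uses only the property and method names documented above. So that the sketch runs without fitting any data, it is exercised on a stand-in object whose values are copied from the illustrative outputs in this page; in practice you would pass a fitted Phitter instance instead.

```python
from types import SimpleNamespace


def describe_best(phi):
    """Collect key facts about the best distribution into one dictionary."""
    name = phi.best_distribution["id"]
    return {
        "id": name,
        "parameters": phi.get_parameters(name),
        "sse": phi.get_sse(name),
        "n_test_passed": phi.get_n_test_passed(name),
    }


# Stand-in for a fitted instance (not a real Phitter object).
stub = SimpleNamespace(
    best_distribution={
        "id": "weibull",
        "parameters": {"alpha": 1.04, "beta": 120.5},
    },
    get_parameters=lambda id_distribution: {"alpha": 1.04, "beta": 120.5},
    get_sse=lambda id_distribution: 0.00215789,
    get_n_test_passed=lambda id_distribution: 2,
)

report = describe_best(stub)
print(report["id"], report["n_test_passed"])  # weibull 2
```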