Chi-Square Goodness-of-Fit Test
The Chi-Square (
Test Procedure
The Chi-Square statistic is calculated using the formula:
where:
represents the observed frequency for bin . represents the expected frequency for bin , calculated based on the cumulative distribution function (CDF) of the proposed distribution. represents the number of bins or classes.
Degrees of Freedom
The test statistic follows a Chi-Square distribution with degrees of freedom calculated as:
where:
is the number of bins. is the number of parameters estimated from the data for the distribution being tested.
Hypothesis Testing
The hypothesis test is structured as follows:
- Null hypothesis
: The data follows the specified probability distribution. - Alternative hypothesis
: The data does not follow the specified probability distribution.
To make a decision regarding the hypotheses, two approaches are available:
Critical value approach: If the calculated
statistic exceeds the critical value from the Chi-Square distribution table at a given significance level ( ), the null hypothesis is rejected. p-value approach: If the calculated p-value is less than the chosen significance level (typically
), the null hypothesis is rejected.
Interpretation
- Rejected (True): The data significantly deviates from the proposed distribution.
- Not Rejected (False): The data is consistent with the proposed distribution.
Implementation in Phitter
In the Phitter library, the Chi-Square test is implemented via the method:
phi.get_test_chi_square(id_distribution: str) -> dict
Returned values:
test_statistic
: The computed Chi-Square statistic.critical_value
: The Chi-Square distribution critical value at the given significance level.p_value
: The p-value associated with the Chi-Square statistic.rejected
: Boolean indicating if the null hypothesis is rejected (True
) or not (False
).
Example Usage
import phitter
# Define dataset
data = [...]
# Fit distributions
phi = phitter.Phitter(data)
phi.fit()
# Chi-Square test for a specific distribution, e.g., "normal"
chi_square_results = phi.get_test_chi_square("normal")
print(chi_square_results)
This will output a dictionary containing the Chi-Square test statistic, critical value, p-value, and the rejection status of the null hypothesis.
Output:
{
"test_statistic": 4.621,
"critical_value": 11.070,
"p_value": 0.795,
"rejected": False
}
Apply Chi Square test to single distribution
Continous case
import phitter
# Define dataset
data = [...]
# Continous measures
continuous_measures = phitter.continuous.ContinuousMeasures(data)
# Define distribution instance
distribution_inst = phitter.continuous.Normal(continuous_measures=continuous_measures)
# Get test result
chi_square_results = phitter.continuous.evaluate_continuous_test_chi_square(distribution_inst, continuous_measures)
Discrete case
import phitter
# Define dataset
data = [...]
# Discrete measures
discrete_measures = phitter.discrete.DiscreteMeasures(data)
# Define distribution instance
distribution_inst = phitter.discrete.Binomial(discrete_measures=discrete_measures)
# Get test result
chi_square_results = phitter.discrete.evaluate_discrete_test_chi_square(distribution_inst, discrete_measures)