HROCH

Symbolic regression and classification library


High-performance Python symbolic regression library based on parallel local search.

  • Zero hyperparameter tuning.
  • Accurate results in seconds or minutes, in contrast to slow GP-based methods.
  • Small model sizes.
  • Support for regression, classification and fuzzy math.
  • Support for 32- and 64-bit floating-point arithmetic.
  • Works with unprotected versions of math operators (log, sqrt, division).
  • Speeds up the search using feature importances computed from a black-box model.
Supported instructions

| | |
|-|-|
| **math** | add, sub, mul, div, pdiv, inv, minv, sq2, pow, exp, log, sqrt, cbrt, aq |
| **goniometric** | sin, cos, tan, asin, acos, atan, sinh, cosh, tanh |
| **other** | nop, max, min, abs, floor, ceil, lt, gt, lte, gte |
| **fuzzy** | f_and, f_or, f_xor, f_impl, f_not, f_nand, f_nor, f_nxor, f_nimpl |
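The search can be restricted to a subset of these instructions through the `problem` parameter, documented in the class reference below. A minimal sketch, assuming the documented dictionary form where each instruction name maps to a relative mutation probability:

```python
from HROCH import SymbolicRegressor

# Search only over a few arithmetic instructions; the values are relative
# mutation probabilities, not absolute frequencies.
reg = SymbolicRegressor(problem={'add': 10.0, 'mul': 10.0, 'sub': 5.0, 'div': 1.0, 'nop': 1.0})
```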

Sources

C++20 source code is available in the separate repo sr_core.

Dependencies

  • AVX2 instruction set (supported by all modern x86-64 CPUs)
  • numpy
  • sklearn

Installation

pip install HROCH

Usage

Symbolic_Regression_Demo.ipynb (Colab)

Documentation

from HROCH import SymbolicRegressor

reg = SymbolicRegressor(num_threads=8, time_limit=60.0, problem='math', precision='f64')
reg.fit(X_train, y_train)
yp = reg.predict(X_test)
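For classification, the workflow is the same. A short sketch using the classifiers documented below (binary classification via NonlinearLogisticRegressor, multiclass via the one-vs-rest SymbolicClassifier); `X_train`, `y_train` and `X_test` are placeholders for your own data:

```python
from HROCH import NonlinearLogisticRegressor, SymbolicClassifier

# Binary classification
clf = NonlinearLogisticRegressor(num_threads=8, time_limit=60.0)
clf.fit(X_train, y_train)
proba = clf.predict_proba(X_test)

# Multiclass classification: one-vs-rest wrapper around the binary classifier
multi = SymbolicClassifier(estimator=NonlinearLogisticRegressor(time_limit=60.0))
multi.fit(X_train, y_train)
yp = multi.predict(X_test)
```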

Changelog

v1.4

  • Scikit-learn compatibility
  • Classifiers:
    • NonlinearLogisticRegressor for binary classification
    • SymbolicClassifier for multiclass classification
    • FuzzyRegressor for a special form of binary classification
  • Xicor correlation used to filter out unrelated features

Older versions

v1.3

  • Public C++ sources
  • Command-line interface changed to CPython
  • Support for classification scores: log loss and accuracy
  • Support for final transformations:
    • ordinal regression
    • logistic function
    • clipping
  • Access to equations from all parallel hillclimbers
  • User-defined constants

v1.2

  • Feature probabilities as an input parameter
  • Custom instruction sets
  • Parallel hillclimbing parameters

v1.1

  • Improved late acceptance hillclimbing

v1.0

  • First release

SRBench

full results


 1"""
 2.. include:: ../README.md
 3"""
 4
 5from .hroch import RegressorMathModel, ClassifierMathModel, Xicor
 6from .regressor import SymbolicRegressor
 7from .fuzzy import FuzzyRegressor, FuzzyClassifier
 8from .classifier import NonlinearLogisticRegressor, SymbolicClassifier
 9from .version import __version__
10
11__all__ = ["SymbolicRegressor", "NonlinearLogisticRegressor", "SymbolicClassifier", "FuzzyRegressor", "FuzzyClassifier", "RegressorMathModel", "ClassifierMathModel", "Xicor", "__version__"]
class SymbolicRegressor(sklearn.base.RegressorMixin, HROCH.hroch.SymbolicSolver):
class SymbolicRegressor(RegressorMixin, SymbolicSolver):
    """
    SymbolicRegressor class

    Parameters
    ----------
    num_threads : int, default=1
        Number of used threads.

    time_limit : float, default=5.0
        Timeout in seconds. If set to 0, there is no limit and the algorithm runs until iter_limit is met.

    iter_limit : int, default=0
        Iterations limit. If set to 0, there is no limit and the algorithm runs until time_limit is met.

    precision : str, default='f32'
        'f64' or 'f32'. Internal floating-point representation.

    problem : str or dict, default='math'
        Predefined instruction set 'math', 'simple' or 'fuzzy', or a custom dictionary of instructions with mutation probabilities.
        ```python
        problem={'add':10.0, 'mul':10.0, 'gt':1.0, 'lt':1.0, 'nop':1.0}
        ```

        |**supported instructions**||
        |-|-|
        |**math**|add, sub, mul, div, pdiv, inv, minv, sq2, pow, exp, log, sqrt, cbrt, aq|
        |**goniometric**|sin, cos, tan, asin, acos, atan, sinh, cosh, tanh|
        |**other**|nop, max, min, abs, floor, ceil, lt, gt, lte, gte|
        |**fuzzy**|f_and, f_or, f_xor, f_impl, f_not, f_nand, f_nor, f_nxor, f_nimpl|

        *nop - no operation*

        *pdiv - protected division*

        *inv - inverse* $(-x)$

        *minv - multiplicative inverse* $(1/x)$

        *lt, gt, lte, gte -* $<, >, <=, >=$

    feature_probs : str or array of shape (n_features,), default='xicor'
        The probability that a mutation will select a feature.
        If None, the features are selected with equal probability.
        If 'xicor', the probabilities are derived from the xicor correlation coefficient https://doi.org/10.1080/01621459.2020.1758115

    random_state : int, default=0
        Random generator seed. If 0, the random generator is initialized from system time.

    verbose : int, default=0
        Controls the verbosity when fitting and predicting.

    metric : str, default='MSE'
        Metric used for evaluating error. Choose from {'MSE', 'MAE', 'MSLE', 'LogLoss'}.

    transformation : str, default=None
        Final transformation applied to the computed value. Choose from {None, 'LOGISTIC', 'ORDINAL'}.

    algo_settings : dict, default=None
        If not defined, SymbolicSolver.ALGO_SETTINGS is used.
        ```python
        algo_settings = {'neighbours_count':15, 'alpha':0.15, 'beta':0.5, 'pretest_size':1, 'sample_size':16}
        ```
        - 'neighbours_count' : (int) Number of tested neighbours in each iteration.
        - 'alpha' : (float) Score worsening limit for an iteration.
        - 'beta' : (float) Tree breadth-wise expanding factor in a range from 0 to 1.
        - 'pretest_size' : (int) Batch count (a batch is a 64-row sample) for fast fitness pre-evaluation.
        - 'sample_size' : (int) Number of batches used to calculate the score during training.

    code_settings : dict, default=None
        If not defined, SymbolicSolver.CODE_SETTINGS is used.
        ```python
        code_settings = {'min_size': 32, 'max_size':32, 'const_size':8}
        ```
        - 'const_size' : (int) Maximum number of constants allowed in a symbolic model; 0 is also accepted.
        - 'min_size' : (int) Minimum allowed equation size (as a linear program).
        - 'max_size' : (int) Maximum allowed equation size (as a linear program).

    population_settings : dict, default=None
        If not defined, SymbolicSolver.POPULATION_SETTINGS is used.
        ```python
        population_settings = {'size': 64, 'tournament':4}
        ```
        - 'size' : (int) Number of individuals in the population.
        - 'tournament' : (int) Tournament selection.

    init_const_settings : dict, default=None
        If not defined, SymbolicSolver.INIT_CONST_SETTINGS is used.
        ```python
        init_const_settings = {'const_min':-1.0, 'const_max':1.0, 'predefined_const_prob':0.0, 'predefined_const_set': []}
        ```
        - 'const_min' : (float) Lower range for initializing constants.
        - 'const_max' : (float) Upper range for initializing constants.
        - 'predefined_const_prob' : (float) Probability of selecting one of the predefined constants during initialization.
        - 'predefined_const_set' : (array of floats) Predefined constants used during initialization.

    const_settings : dict, default=None
        If not defined, SymbolicSolver.CONST_SETTINGS is used.
        ```python
        const_settings = {'const_min':-LARGE_FLOAT, 'const_max':LARGE_FLOAT, 'predefined_const_prob':0.0, 'predefined_const_set': []}
        ```
        - 'const_min' : (float) Lower range for constants used in equations.
        - 'const_max' : (float) Upper range for constants used in equations.
        - 'predefined_const_prob' : (float) Probability of selecting one of the predefined constants during the search process (mutation).
        - 'predefined_const_set' : (array of floats) Predefined constants used during the search process (mutation).

    target_clip : array of two float values clip_min and clip_max, default=None
        ```python
        target_clip=[-1, 1]
        ```

    cv_params : dict, default=None
        If not defined, SymbolicSolver.REGRESSION_CV_PARAMS is used.
        ```python
        cv_params = {'n':0, 'cv_params':{}, 'select':'mean', 'opt_params':{'method': 'Nelder-Mead'}, 'opt_metric':make_scorer(mean_squared_error, greater_is_better=False)}
        ```
        - 'n' : (int) Cross-validate the n top models.
        - 'cv_params' : (dict) Parameters passed to scikit-learn's cross_validate method.
        - 'select' : (str) Best model selection method; choose from 'mean' or 'median'.
        - 'opt_params' : (dict) Parameters passed to scipy.optimize.minimize.
        - 'opt_metric' : (make_scorer) Scoring method.

    warm_start : bool, default=False
        If True, the solver is reused for the next call of fit.
    """

    def __init__(
        self,
        num_threads: int = 1,
        time_limit: float = 5.0,
        iter_limit: int = 0,
        precision: str = "f32",
        problem="math",
        feature_probs="xicor",
        random_state: int = 0,
        verbose: int = 0,
        metric: str = "MSE",
        transformation: str = None,
        algo_settings=None,
        code_settings=None,
        population_settings=None,
        init_const_settings=None,
        const_settings=None,
        target_clip: Iterable = None,
        cv_params=None,
        warm_start: bool = False,
    ):
        super(SymbolicRegressor, self).__init__(
            num_threads=num_threads,
            time_limit=time_limit,
            iter_limit=iter_limit,
            precision=precision,
            problem=problem,
            feature_probs=feature_probs,
            random_state=random_state,
            verbose=verbose,
            metric=metric,
            transformation=transformation,
            algo_settings=algo_settings,
            code_settings=code_settings,
            population_settings=population_settings,
            init_const_settings=init_const_settings,
            const_settings=const_settings,
            target_clip=target_clip,
            class_weight=None,
            cv_params=cv_params,
            warm_start=warm_start,
        )

    def fit(self, X, y, sample_weight=None, check_input=True):
        """
        Fit the symbolic models according to the given training data.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Training vector, where `n_samples` is the number of samples and
            `n_features` is the number of features.

        y : array-like of shape (n_samples,)
            Target vector relative to X.

        sample_weight : array-like of shape (n_samples,), default=None
            Array of weights that are assigned to individual samples.
            If not provided, each sample is given unit weight.

        check_input : bool, default=True
            Allows bypassing several input checks.
            Don't use this parameter unless you know what you're doing.

        Returns
        -------
        self
            Fitted estimator.
        """

        super(SymbolicRegressor, self).fit(X, y, sample_weight=sample_weight, check_input=check_input)
        return self

    def predict(self, X, id=None, check_input=True, use_parsed_model=True):
        """
        Predict regression target for X.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The input samples.

        id : int, default=None
            Model id; it can be obtained from the get_models method. If None, the best model is used for prediction.

        check_input : bool, default=True
            Allows bypassing several input checks.
            Don't use this parameter unless you know what you're doing.

        Returns
        -------
        y : ndarray of shape (n_samples,) or (n_samples, n_outputs)
            The predicted values.
        """
        return super(SymbolicRegressor, self).predict(X, id=id, check_input=check_input, use_parsed_model=use_parsed_model)

    def __sklearn_tags__(self):
        return super().__sklearn_tags__()
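A sketch of how the settings dictionaries documented above can be combined. All keys and values come from the defaults listed in the docstring except where a comment notes otherwise; `X` and `y` are placeholders for any regression dataset:

```python
from HROCH import SymbolicRegressor

reg = SymbolicRegressor(
    num_threads=4,
    time_limit=30.0,
    precision='f64',
    # larger hill-climbing neighbourhood than the default 15; other keys keep their defaults
    algo_settings={'neighbours_count': 30, 'alpha': 0.15, 'beta': 0.5, 'pretest_size': 1, 'sample_size': 16},
    # allow somewhat larger equations than the default max_size of 32
    code_settings={'min_size': 32, 'max_size': 64, 'const_size': 8},
    population_settings={'size': 64, 'tournament': 4},
)
reg.fit(X, y)
y_pred = reg.predict(X)
```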

class NonlinearLogisticRegressor(sklearn.base.ClassifierMixin, HROCH.hroch.SymbolicSolver):
class NonlinearLogisticRegressor(ClassifierMixin, SymbolicSolver):
    """
    Nonlinear Logistic Regressor

    Parameters
    ----------
    num_threads : int, default=1
        Number of used threads.

    time_limit : float, default=5.0
        Timeout in seconds. If set to 0, there is no limit and the algorithm runs until iter_limit is met.

    iter_limit : int, default=0
        Iterations limit. If set to 0, there is no limit and the algorithm runs until time_limit is met.

    precision : str, default='f32'
        'f64' or 'f32'. Internal floating-point representation.

    problem : str or dict, default='math'
        Predefined instruction set 'math', 'simple' or 'fuzzy', or a custom dictionary of instructions with mutation probabilities.
        ```python
        problem={'add':10.0, 'mul':10.0, 'gt':1.0, 'lt':1.0, 'nop':1.0}
        ```

        |**supported instructions**||
        |-|-|
        |**math**|add, sub, mul, div, pdiv, inv, minv, sq2, pow, exp, log, sqrt, cbrt, aq|
        |**goniometric**|sin, cos, tan, asin, acos, atan, sinh, cosh, tanh|
        |**other**|nop, max, min, abs, floor, ceil, lt, gt, lte, gte|
        |**fuzzy**|f_and, f_or, f_xor, f_impl, f_not, f_nand, f_nor, f_nxor, f_nimpl|

        *nop - no operation*

        *pdiv - protected division*

        *inv - inverse* $(-x)$

        *minv - multiplicative inverse* $(1/x)$

        *lt, gt, lte, gte -* $<, >, <=, >=$

    feature_probs : str or array of shape (n_features,), default='xicor'
        The probability that a mutation will select a feature.
        If None, the features are selected with equal probability.
        If 'xicor', the probabilities are derived from the xicor correlation coefficient https://doi.org/10.1080/01621459.2020.1758115

    random_state : int, default=0
        Random generator seed. If 0, the random generator is initialized from system time.

    verbose : int, default=0
        Controls the verbosity when fitting and predicting.

    metric : str, default='LogLoss'
        Metric used for evaluating error. Choose from {'MSE', 'MAE', 'MSLE', 'LogLoss'}.

    transformation : str, default='LOGISTIC'
        Final transformation applied to the computed value. Choose from {None, 'LOGISTIC', 'ORDINAL'}.

    algo_settings : dict, default=None
        If not defined, SymbolicSolver.ALGO_SETTINGS is used.
        ```python
        algo_settings = {'neighbours_count':15, 'alpha':0.15, 'beta':0.5, 'pretest_size':1, 'sample_size':16}
        ```
        - 'neighbours_count' : (int) Number of tested neighbours in each iteration.
        - 'alpha' : (float) Score worsening limit for an iteration.
        - 'beta' : (float) Tree breadth-wise expanding factor in a range from 0 to 1.
        - 'pretest_size' : (int) Batch count (a batch is a 64-row sample) for fast fitness pre-evaluation.
        - 'sample_size' : (int) Number of batches used to calculate the score during training.

    code_settings : dict, default=None
        If not defined, SymbolicSolver.CODE_SETTINGS is used.
        ```python
        code_settings = {'min_size': 32, 'max_size':32, 'const_size':8}
        ```
        - 'const_size' : (int) Maximum number of constants allowed in a symbolic model; 0 is also accepted.
        - 'min_size' : (int) Minimum allowed equation size (as a linear program).
        - 'max_size' : (int) Maximum allowed equation size (as a linear program).

    population_settings : dict, default=None
        If not defined, SymbolicSolver.POPULATION_SETTINGS is used.
        ```python
        population_settings = {'size': 64, 'tournament':4}
        ```
        - 'size' : (int) Number of individuals in the population.
        - 'tournament' : (int) Tournament selection.

    init_const_settings : dict, default=None
        If not defined, SymbolicSolver.INIT_CONST_SETTINGS is used.
        ```python
        init_const_settings = {'const_min':-1.0, 'const_max':1.0, 'predefined_const_prob':0.0, 'predefined_const_set': []}
        ```
        - 'const_min' : (float) Lower range for initializing constants.
        - 'const_max' : (float) Upper range for initializing constants.
        - 'predefined_const_prob' : (float) Probability of selecting one of the predefined constants during initialization.
        - 'predefined_const_set' : (array of floats) Predefined constants used during initialization.

    const_settings : dict, default=None
        If not defined, SymbolicSolver.CONST_SETTINGS is used.
        ```python
        const_settings = {'const_min':-LARGE_FLOAT, 'const_max':LARGE_FLOAT, 'predefined_const_prob':0.0, 'predefined_const_set': []}
        ```
        - 'const_min' : (float) Lower range for constants used in equations.
        - 'const_max' : (float) Upper range for constants used in equations.
        - 'predefined_const_prob' : (float) Probability of selecting one of the predefined constants during the search process (mutation).
        - 'predefined_const_set' : (array of floats) Predefined constants used during the search process (mutation).

    target_clip : array, default=None
        Array of two float values clip_min and clip_max.
        If not defined, SymbolicSolver.CLASSIFICATION_TARGET_CLIP is used.
        ```python
        target_clip=[3e-7, 1.0-3e-7]
        ```

    class_weight : dict or 'balanced', default=None
        Weights associated with classes in the form ``{class_label: weight}``.
        If not given, all classes are supposed to have weight one.

        The "balanced" mode uses the values of y to automatically adjust
        weights inversely proportional to class frequencies in the input data
        as ``n_samples / (n_classes * np.bincount(y))``.

        Note that these weights will be multiplied with sample_weight (passed
        through the fit method) if sample_weight is specified.

    cv_params : dict, default=None
        If not defined, SymbolicSolver.CLASSIFICATION_CV_PARAMS is used.
        ```python
        cv_params = {'n':0, 'cv_params':{}, 'select':'mean', 'opt_params':{'method': 'Nelder-Mead'}, 'opt_metric':make_scorer(log_loss, greater_is_better=False, response_method='predict_proba')}
        ```
        - 'n' : (int) Cross-validate the n top models.
        - 'cv_params' : (dict) Parameters passed to scikit-learn's cross_validate method.
        - 'select' : (str) Best model selection method; choose from 'mean' or 'median'.
        - 'opt_params' : (dict) Parameters passed to scipy.optimize.minimize.
        - 'opt_metric' : (make_scorer) Scoring method.

    warm_start : bool, default=False
        If True, the solver is reused for the next call of fit.
    """

    def __init__(
        self,
        num_threads: int = 1,
        time_limit: float = 5.0,
        iter_limit: int = 0,
        precision: str = "f32",
        problem="math",
        feature_probs="xicor",
        random_state: int = 0,
        verbose: int = 0,
        metric: str = "LogLoss",
        transformation: str = "LOGISTIC",
        algo_settings=None,
        code_settings=None,
        population_settings=None,
        init_const_settings=None,
        const_settings=None,
        target_clip=None,
        class_weight=None,
        cv_params=None,
        warm_start: bool = False,
    ):
        super(NonlinearLogisticRegressor, self).__init__(
            num_threads=num_threads,
            time_limit=time_limit,
            iter_limit=iter_limit,
            precision=precision,
            problem=problem,
            feature_probs=feature_probs,
            random_state=random_state,
            verbose=verbose,
            metric=metric,
            transformation=transformation,
            algo_settings=algo_settings,
            code_settings=code_settings,
            population_settings=population_settings,
            init_const_settings=init_const_settings,
            const_settings=const_settings,
            target_clip=target_clip,
            class_weight=class_weight,
            cv_params=cv_params,
            warm_start=warm_start,
        )

    def fit(self, X, y, sample_weight=None, check_input=True):
        """
        Fit the symbolic models according to the given training data.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Training vector, where `n_samples` is the number of samples and
            `n_features` is the number of features.

        y : array-like of shape (n_samples,)
            Target vector relative to X. Needs samples of 2 classes.

        sample_weight : array-like of shape (n_samples,), default=None
            Array of weights that are assigned to individual samples.
            If not provided, each sample is given unit weight.

        check_input : bool, default=True
            Allows bypassing several input checks.
            Don't use this parameter unless you know what you're doing.

        Returns
        -------
        self
            Fitted estimator.
        """

        if check_input:
            X, y = validate_data(self, X, y, accept_sparse=False, y_numeric=False, multi_output=False)

        y_type = type_of_target(y, input_name="y", raise_unknown=True)
        if y_type != "binary":
            raise ValueError("Only binary classification is supported. The type of the target " f"is {y_type}.")
        check_classification_targets(y)
        enc = LabelEncoder()
        y_ind = enc.fit_transform(y)
        self.classes_ = enc.classes_
        self.n_classes_ = len(self.classes_)
        if self.n_classes_ != 2:
            raise ValueError("This solver needs samples of 2 classes" " in the data, but the data contains" " %r classes" % self.n_classes_)

        self.class_weight_ = compute_class_weight(self.class_weight, classes=self.classes_, y=y)

        super(NonlinearLogisticRegressor, self).fit(X, y_ind, sample_weight=sample_weight, check_input=False)
        return self

    def predict(self, X, id=None, check_input=True, use_parsed_model=True):
        """
        Predict class for X.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The input samples.

        id : int, default=None
            Model id; it can be obtained from the get_models method. If None, the best model is used for prediction.

        check_input : bool, default=True
            Allows bypassing several input checks.
            Don't use this parameter unless you know what you're doing.

        Returns
        -------
        y : ndarray of shape (n_samples,)
            The predicted classes.
        """
        preds = super(NonlinearLogisticRegressor, self).predict(X, id, check_input=check_input, use_parsed_model=use_parsed_model)
        return self.classes_[(preds > 0.5).astype(int)]

    def predict_proba(self, X, id=None, check_input=True):
        """
        Predict class probabilities for X.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The input samples.

        check_input : bool, default=True
            Allows bypassing several input checks.
            Don't use this parameter unless you know what you're doing.

        Returns
        -------
        T : ndarray of shape (n_samples, n_classes)
            The class probabilities of the input samples. The order of the
            classes corresponds to that in the attribute :term:`classes_`.
        """
        preds = super(NonlinearLogisticRegressor, self).predict(X, id, check_input=check_input)
        proba = numpy.vstack([1 - preds, preds]).T
        return proba

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.classifier_tags = ClassifierTags(multi_class=False)
        return tags
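A usage sketch for the binary classifier, built only from the parameters and methods documented above; `X_train`, `y_train` and `X_test` are placeholders for a scikit-learn-style dataset split:

```python
from HROCH import NonlinearLogisticRegressor

clf = NonlinearLogisticRegressor(
    num_threads=4,
    time_limit=30.0,
    class_weight='balanced',  # re-weight classes inversely to their frequency, as documented above
)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)            # hard labels taken from clf.classes_
y_proba = clf.predict_proba(X_test)     # shape (n_samples, 2), columns ordered as clf.classes_
```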

class SymbolicClassifier(sklearn.multiclass.OneVsRestClassifier):
class SymbolicClassifier(OneVsRestClassifier):
    """
    One-vs-rest (OVR) multiclass symbolic classifier

    Parameters
    ----------
    estimator : NonlinearLogisticRegressor
        Instance of the NonlinearLogisticRegressor class.
    """

    def __init__(self, estimator: NonlinearLogisticRegressor):
        super().__init__(estimator=estimator)

    def fit(self, X, y):
        """
        Fit the symbolic models according to the given training data.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Training vector, where `n_samples` is the number of samples and
            `n_features` is the number of features. Should be in the range [0, 1].

        y : array-like of shape (n_samples,)
            Target vector relative to X.

        Returns
        -------
        self
            Fitted estimator.
        """

        super().fit(X, y)
        return self

    def predict(self, X):
        """
        Predict class for X.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The input samples.

        Returns
        -------
        y : ndarray of shape (n_samples,)
            The predicted classes.
        """
        return super().predict(X)

    def predict_proba(self, X):
        """
        Predict class probabilities for X.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The input samples.

        Returns
        -------
        T : ndarray of shape (n_samples, n_classes)
            The class probabilities of the input samples. The order of the
            classes corresponds to that in the attribute :term:`classes_`.
        """
        return super().predict_proba(X)

    def __sklearn_tags__(self):
        return super().__sklearn_tags__()
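A multiclass sketch: SymbolicClassifier trains one NonlinearLogisticRegressor per class in a one-vs-rest scheme, so the base estimator's parameters control each per-class search. `X_train`, `y_train` and `X_test` are placeholders:

```python
from HROCH import NonlinearLogisticRegressor, SymbolicClassifier

# note: the fit docstring above recommends features scaled to the range [0, 1]
base = NonlinearLogisticRegressor(num_threads=2, time_limit=20.0)
ovr = SymbolicClassifier(estimator=base)
ovr.fit(X_train, y_train)             # y_train may contain more than two classes
probs = ovr.predict_proba(X_test)     # shape (n_samples, n_classes), ordered as ovr.classes_
labels = ovr.predict(X_test)
```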

OVR multiclass symbolic classificator

Parameters
  • estimator (NonlinearLogisticRegressor): Instance of NonlinearLogisticRegressor class.
SymbolicClassifier(estimator: NonlinearLogisticRegressor)
303    def __init__(self, estimator: NonlinearLogisticRegressor):
304        super().__init__(estimator=estimator)
def fit(self, X, y):
306    def fit(self, X, y):
307        """
308        Fit the symbolic models according to the given training data.
309
310        Parameters
311        ----------
312        X : array-like of shape (n_samples, n_features)
313            Training vector, where `n_samples` is the number of samples and
314            `n_features` is the number of features. Should be in the range [0, 1].
315
316        y : array-like of shape (n_samples,)
317            Target vector relative to X.
318
319        Returns
320        -------
321        self
322            Fitted estimator.
323        """
324
325        super().fit(X, y)
326        return self

Fit the symbolic models according to the given training data.

Parameters
  • X (array-like of shape (n_samples, n_features)): Training vector, where n_samples is the number of samples and n_features is the number of features. Should be in the range [0, 1].
  • y (array-like of shape (n_samples,)): Target vector relative to X.
Returns
  • self: Fitted estimator.
def predict(self, X):
328    def predict(self, X):
329        """
330        Predict class for X.
331
332        Parameters
333        ----------
334        X : array-like of shape (n_samples, n_features)
335            The input samples.
336
337        Returns
338        -------
339        y : ndarray of shape (n_samples,)
340            The predicted classes.
341        """
342        return super().predict(X)

Predict class for X.

Parameters
  • X (array-like of shape (n_samples, n_features)): The input samples.
Returns
  • y (ndarray of shape (n_samples,)): The predicted classes.
def predict_proba(self, X):
344    def predict_proba(self, X):
345        """
346        Predict class probabilities for X.
347
348        Parameters
349        ----------
350        X : array-like of shape (n_samples, n_features)
351
352        Returns
353        -------
354        T : ndarray of shape (n_samples, n_classes)
355            The class probabilities of the input samples. The order of the
356            classes corresponds to that in the attribute :term:`classes_`.
357        """
358        return super().predict_proba(X)

Predict class probabilities for X.

Parameters
  • X (array-like of shape (n_samples, n_features)): The input samples.
Returns
  • T (ndarray of shape (n_samples, n_classes)): The class probabilities of the input samples. The order of the classes corresponds to that in the attribute `classes_`.
class FuzzyRegressor(sklearn.base.ClassifierMixin, HROCH.hroch.SymbolicSolver):
 12class FuzzyRegressor(ClassifierMixin, SymbolicSolver):
 13    """
 14    Fuzzy Regressor
 15
 16    Parameters
 17    ----------
 18    num_threads : int, default=1
 19        Number of used threads.
 20
 21    time_limit : float, default=5.0
 22        Timeout in seconds. If set to 0, there is no limit and the algorithm runs until iter_limit is met.
 23
 24    iter_limit : int, default=0
 25        Iterations limit. If set to 0, there is no limit and the algorithm runs until time_limit is met.
 26
 27    precision : str, default='f32'
 28        'f64' or 'f32'. Internal floating number representation.
 29
 30    problem : str or dict, default='fuzzy'
 31        Predefined instruction set 'fuzzy', or a custom-defined set of instructions with mutation probabilities.
 32        ```python
 33        problem={'f_and':10.0, 'f_or':10.0, 'f_xor':1.0, 'f_not':1.0, 'nop':1.0}
 34        ```
 35
 36        |**supported instructions**||
 37        |-|-|
 38        |**other**|nop|
 39        |**fuzzy**|f_and, f_or, f_xor, f_impl, f_not, f_nand, f_nor, f_nxor, f_nimpl|
 40
 41    feature_probs : str or array of shape (n_features,), default='xicor'
 42        The probability that a mutation will select a feature.
 43        If None then the features are selected with equal probability.
 44        If 'xicor' then the probabilities are derived from the xicor correlation coefficient https://doi.org/10.1080/01621459.2020.1758115
 45
 46    random_state : int, default=0
 47        Random generator seed. If 0 then random generator will be initialized by system time.
 48
 49    verbose : int, default=0
 50        Controls the verbosity when fitting and predicting.
 51
 52    metric : str, default='LogLoss'
 53        Metric used for evaluating error. Choose from {'MSE', 'MAE', 'MSLE', 'LogLoss'}
 54
 55    transformation : str, default=None
 56        Final transformation for computed value. Choose from { None, 'LOGISTIC', 'ORDINAL'}
 57
 58    algo_settings : dict, default = None
 59        If not defined SymbolicSolver.ALGO_SETTINGS is used.
 60        ```python
 61        algo_settings = {'neighbours_count':15, 'alpha':0.15, 'beta':0.5, 'pretest_size':1, 'sample_size':16}
 62        ```
 63        - 'neighbours_count' : (int) Number of tested neighbours in each iteration
 64        - 'alpha' : (float) Score worsening limit for an iteration
 65        - 'beta' : (float) Tree breadth-wise expanding factor in a range from 0 to 1
 66        - 'pretest_size' : (int) Batch count (one batch is a 64-row sample) for fast fitness pre-evaluation
 67        - 'sample_size' : (int) Number of batches used to calculate the score during training
 68
 69    code_settings : dict, default = None
 70        If not defined SymbolicSolver.CODE_SETTINGS is used.
 71        ```python
 72        code_settings = {'min_size': 32, 'max_size':32, 'const_size':8}
 73        ```
 74        - 'const_size' : (int) Maximum allowed number of constants in the symbolic model; 0 is also accepted.
 75        - 'min_size': (int) Minimum allowed equation size (as a linear program).
 76        - 'max_size' : (int) Maximum allowed equation size (as a linear program).
 77
 78    population_settings : dict, default = None
 79        If not defined SymbolicSolver.POPULATION_SETTINGS is used.
 80        ```python
 81        population_settings = {'size': 64, 'tournament':4}
 82        ```
 83        - 'size' : (int) Number of individuals in the population.
 84        - 'tournament' : (int) Tournament selection.
 85
 86    init_const_settings : dict, default = None
 87        If not defined FuzzyRegressor.INIT_CONST_SETTINGS is used.
 88        ```python
 89        init_const_settings = {'const_min':0.0, 'const_max':1.0, 'predefined_const_prob':0.0, 'predefined_const_set': []}
 90        ```
 91        - 'const_min' : (float) Lower range for initializing constants.
 92        - 'const_max' : (float) Upper range for initializing constants.
 93        - 'predefined_const_prob': (float) Probability of selecting one of the predefined constants during initialization.
 94        - 'predefined_const_set' : (array of floats) Predefined constants used during initialization.
 95
 96    const_settings : dict, default = None
 97        If not defined FuzzyRegressor.CONST_SETTINGS is used.
 98        ```python
 99        const_settings = {'const_min':0.0, 'const_max':1.0, 'predefined_const_prob':0.0, 'predefined_const_set': []}
100        ```
101        - 'const_min' : (float) Lower range for constants used in equations.
102        - 'const_max' : (float) Upper range for constants used in equations.
103        - 'predefined_const_prob': (float) Probability of selecting one of the predefined constants during the search process (mutation).
104        - 'predefined_const_set' : (array of floats) Predefined constants used during the search process (mutation).
105
106    target_clip : array, default = None
107        Array of two float values clip_min and clip_max.
108        If not defined SymbolicSolver.CLASSIFICATION_TARGET_CLIP is used.
109        ```python
110        target_clip=[3e-7, 1.0-3e-7]
111        ```
112    class_weight : dict or 'balanced', default=None
113        Weights associated with classes in the form ``{class_label: weight}``.
114        If not given, all classes are supposed to have weight one.
115
116        The "balanced" mode uses the values of y to automatically adjust
117        weights inversely proportional to class frequencies in the input data
118        as ``n_samples / (n_classes * np.bincount(y))``.
119
120        Note that these weights will be multiplied with sample_weight (passed
121        through the fit method) if sample_weight is specified.
122
123    cv_params : dict, default = None
124        If not defined SymbolicSolver.CLASSIFICATION_CV_PARAMS is used.
125        ```python
126        cv_params = {'n':0, 'cv_params':{}, 'select':'mean', 'opt_params':{'method': 'Nelder-Mead'}, 'opt_metric':make_scorer(log_loss, greater_is_better=False, response_method='predict_proba')}
127        ```
128        - 'n' : (int) Cross-validate the n top models
129        - 'cv_params' : (dict) Parameters passed to the scikit-learn cross_validate method
130        - 'select' : (str) Best model selection method; choose from 'mean' or 'median'
131        - 'opt_params' : (dict) Parameters passed to the scipy.optimize.minimize method
132        - 'opt_metric' : (make_scorer) Scoring method
133
134    warm_start : bool, default=False
135        If True, then the solver will be reused for the next call of fit.
136    """
137
138    INIT_CONST_SETTINGS = {
139        "const_min": 0.0,
140        "const_max": 1.0,
141        "predefined_const_prob": 0.0,
142        "predefined_const_set": [],
143    }
144    CONST_SETTINGS = {
145        "const_min": 0.0,
146        "const_max": 1.0,
147        "predefined_const_prob": 0.0,
148        "predefined_const_set": [],
149    }
150
151    def __init__(
152        self,
153        num_threads: int = 1,
154        time_limit: float = 5.0,
155        iter_limit: int = 0,
156        precision: str = "f32",
157        problem="fuzzy",
158        feature_probs="xicor",
159        random_state: int = 0,
160        verbose: int = 0,
161        metric: str = "LogLoss",
162        transformation: str = None,
163        algo_settings=None,
164        code_settings=None,
165        population_settings=None,
166        init_const_settings=None,
167        const_settings=None,
168        target_clip=None,
169        class_weight=None,
170        cv_params=None,
171        warm_start: bool = False,
172    ):
173        super(FuzzyRegressor, self).__init__(
174            num_threads=num_threads,
175            time_limit=time_limit,
176            iter_limit=iter_limit,
177            precision=precision,
178            problem=problem,
179            feature_probs=feature_probs,
180            random_state=random_state,
181            verbose=verbose,
182            metric=metric,
183            algo_settings=algo_settings,
184            transformation=transformation,
185            code_settings=code_settings,
186            population_settings=population_settings,
187            init_const_settings=init_const_settings,
188            const_settings=const_settings,
189            target_clip=target_clip,
190            class_weight=class_weight,
191            cv_params=cv_params,
192            warm_start=warm_start,
193        )
194
195    def fit(self, X, y, sample_weight=None, check_input=True):
196        """
197        Fit the symbolic models according to the given training data.
198
199        Parameters
200        ----------
201        X : array-like of shape (n_samples, n_features)
202            Training vector, where `n_samples` is the number of samples and
203            `n_features` is the number of features. Should be in the range [0, 1].
204
205        y : array-like of shape (n_samples,)
206            Target vector relative to X. Needs samples of 2 classes.
207
208        sample_weight : array-like of shape (n_samples,) default=None
209            Array of weights that are assigned to individual samples.
210            If not provided, then each sample is given unit weight.
211
212        check_input : bool, default=True
213            Allow to bypass several input checking.
214            Don't use this parameter unless you know what you're doing.
215
216        Returns
217        -------
218        self
219            Fitted estimator.
220        """
221        if check_input:
222            X, y = validate_data(self, X, y, accept_sparse=False, y_numeric=False, multi_output=False)
223
224        y_type = type_of_target(y, input_name="y", raise_unknown=True)
225        if y_type != "binary":
226            raise ValueError("Only binary classification is supported. The type of the target " f"is {y_type}.")
227        check_classification_targets(y)
228        enc = LabelEncoder()
229        y_ind = enc.fit_transform(y)
230        self.classes_ = enc.classes_
231        self.n_classes_ = len(self.classes_)
232        if self.n_classes_ != 2:
233            raise ValueError("This solver needs samples of 2 classes" " in the data, but the data contains" " %r classes" % self.n_classes_)
234
235        self.class_weight_ = compute_class_weight(self.class_weight, classes=self.classes_, y=y)
236
237        super(FuzzyRegressor, self).fit(X, y_ind, sample_weight=sample_weight, check_input=check_input)
238        return self
239
240    def predict(self, X, id=None, check_input=True, use_parsed_model=True):
241        """
242        Predict class for X.
243
244        Parameters
245        ----------
246        X : array-like of shape (n_samples, n_features)
247            The input samples.
248
249        id : int
250            Model id, default=None. The id can be obtained from the get_models method. If it is None, the best model is used for prediction.
251
252        check_input : bool, default=True
253            Allow to bypass several input checking.
254            Don't use this parameter unless you know what you're doing.
255
256        Returns
257        -------
258        y : ndarray of shape (n_samples,)
259            The predicted classes.
260        """
261        preds = super(FuzzyRegressor, self).predict(X, id, check_input=check_input, use_parsed_model=use_parsed_model)
262        return self.classes_[(preds > 0.5).astype(int)]
263
264    def predict_proba(self, X, id=None, check_input=True):
265        """
266        Predict class probabilities for X.
267
268        Parameters
269        ----------
270        X : array-like of shape (n_samples, n_features)
271
272        check_input : bool, default=True
273            Allow to bypass several input checking.
274            Don't use this parameter unless you know what you're doing.
275
276        Returns
277        -------
278        T : ndarray of shape (n_samples, n_classes)
279            The class probabilities of the input samples. The order of the
280            classes corresponds to that in the attribute :term:`classes_`.
281        """
282        preds = super(FuzzyRegressor, self).predict(X, id, check_input=check_input)
283        proba = numpy.vstack([1 - preds, preds]).T
284        return proba
285
286    def __sklearn_tags__(self):
287        tags = super().__sklearn_tags__()
288        tags.estimator_type = "classifier"
289        tags.classifier_tags = ClassifierTags(multi_class=False, poor_score=True)
290        return tags

Fuzzy Regressor

Parameters
  • num_threads (int, default=1): Number of used threads.
  • time_limit (float, default=5.0): Timeout in seconds. If set to 0, there is no limit and the algorithm runs until iter_limit is met.
  • iter_limit (int, default=0): Iterations limit. If set to 0, there is no limit and the algorithm runs until time_limit is met.
  • precision (str, default='f32'): 'f64' or 'f32'. Internal floating number representation.
  • problem (str or dict, default='fuzzy'): Predefined instruction set 'fuzzy', or a custom-defined set of instructions with mutation probabilities.

    problem={'f_and':10.0, 'f_or':10.0, 'f_xor':1.0, 'f_not':1.0, 'nop':1.0}
    
    supported instructions
    other nop
    fuzzy f_and, f_or, f_xor, f_impl, f_not, f_nand, f_nor, f_nxor, f_nimpl
  • feature_probs (str or array of shape (n_features,), default='xicor'): The probability that a mutation will select a feature. If None then the features are selected with equal probability. If 'xicor' then the probabilities are derived from the xicor correlation coefficient https://doi.org/10.1080/01621459.2020.1758115
  • random_state (int, default=0): Random generator seed. If 0 then random generator will be initialized by system time.
  • verbose (int, default=0): Controls the verbosity when fitting and predicting.
  • metric (str, default='LogLoss'): Metric used for evaluating error. Choose from {'MSE', 'MAE', 'MSLE', 'LogLoss'}
  • transformation (str, default=None): Final transformation for computed value. Choose from { None, 'LOGISTIC', 'ORDINAL'}
  • algo_settings (dict, default = None): If not defined SymbolicSolver.ALGO_SETTINGS is used.

    algo_settings = {'neighbours_count':15, 'alpha':0.15, 'beta':0.5, 'pretest_size':1, 'sample_size':16}
    
    • 'neighbours_count' : (int) Number of tested neighbours in each iteration
    • 'alpha' : (float) Score worsening limit for an iteration
    • 'beta' : (float) Tree breadth-wise expanding factor in a range from 0 to 1
    • 'pretest_size' : (int) Batch count (one batch is a 64-row sample) for fast fitness pre-evaluation
    • 'sample_size' : (int) Number of batches used to calculate the score during training
  • code_settings (dict, default = None): If not defined SymbolicSolver.CODE_SETTINGS is used.

    code_settings = {'min_size': 32, 'max_size':32, 'const_size':8}
    
    • 'const_size' : (int) Maximum allowed number of constants in the symbolic model; 0 is also accepted.
    • 'min_size': (int) Minimum allowed equation size (as a linear program).
    • 'max_size' : (int) Maximum allowed equation size (as a linear program).
  • population_settings (dict, default = None): If not defined SymbolicSolver.POPULATION_SETTINGS is used.

    population_settings = {'size': 64, 'tournament':4}
    
    • 'size' : (int) Number of individuals in the population.
    • 'tournament' : (int) Tournament selection.
  • init_const_settings (dict, default = None): If not defined FuzzyRegressor.INIT_CONST_SETTINGS is used.

    init_const_settings = {'const_min':0.0, 'const_max':1.0, 'predefined_const_prob':0.0, 'predefined_const_set': []}
    
    • 'const_min' : (float) Lower range for initializing constants.
    • 'const_max' : (float) Upper range for initializing constants.
    • 'predefined_const_prob': (float) Probability of selecting one of the predefined constants during initialization.
    • 'predefined_const_set' : (array of floats) Predefined constants used during initialization.
  • const_settings (dict, default = None): If not defined FuzzyRegressor.CONST_SETTINGS is used.

    const_settings = {'const_min':0.0, 'const_max':1.0, 'predefined_const_prob':0.0, 'predefined_const_set': []}
    
    • 'const_min' : (float) Lower range for constants used in equations.
    • 'const_max' : (float) Upper range for constants used in equations.
    • 'predefined_const_prob': (float) Probability of selecting one of the predefined constants during the search process (mutation).
    • 'predefined_const_set' : (array of floats) Predefined constants used during the search process (mutation).
  • target_clip (array, default = None): Array of two float values clip_min and clip_max. If not defined SymbolicSolver.CLASSIFICATION_TARGET_CLIP is used.

    target_clip=[3e-7, 1.0-3e-7]
    
  • class_weight (dict or 'balanced', default=None): Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

    The "balanced" mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).

    Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.

  • cv_params (dict, default = None): If not defined SymbolicSolver.CLASSIFICATION_CV_PARAMS is used.

    cv_params = {'n':0, 'cv_params':{}, 'select':'mean', 'opt_params':{'method': 'Nelder-Mead'}, 'opt_metric':make_scorer(log_loss, greater_is_better=False, response_method='predict_proba')}
    
    • 'n' : (int) Cross-validate the n top models
    • 'cv_params' : (dict) Parameters passed to the scikit-learn cross_validate method
    • 'select' : (str) Best model selection method; choose from 'mean' or 'median'
    • 'opt_params' : (dict) Parameters passed to the scipy.optimize.minimize method
    • 'opt_metric' : (make_scorer) Scoring method
  • warm_start (bool, default=False): If True, then the solver will be reused for the next call of fit.
FuzzyRegressor( num_threads: int = 1, time_limit: float = 5.0, iter_limit: int = 0, precision: str = 'f32', problem='fuzzy', feature_probs='xicor', random_state: int = 0, verbose: int = 0, metric: str = 'LogLoss', transformation: str = None, algo_settings=None, code_settings=None, population_settings=None, init_const_settings=None, const_settings=None, target_clip=None, class_weight=None, cv_params=None, warm_start: bool = False)
151    def __init__(
152        self,
153        num_threads: int = 1,
154        time_limit: float = 5.0,
155        iter_limit: int = 0,
156        precision: str = "f32",
157        problem="fuzzy",
158        feature_probs="xicor",
159        random_state: int = 0,
160        verbose: int = 0,
161        metric: str = "LogLoss",
162        transformation: str = None,
163        algo_settings=None,
164        code_settings=None,
165        population_settings=None,
166        init_const_settings=None,
167        const_settings=None,
168        target_clip=None,
169        class_weight=None,
170        cv_params=None,
171        warm_start: bool = False,
172    ):
173        super(FuzzyRegressor, self).__init__(
174            num_threads=num_threads,
175            time_limit=time_limit,
176            iter_limit=iter_limit,
177            precision=precision,
178            problem=problem,
179            feature_probs=feature_probs,
180            random_state=random_state,
181            verbose=verbose,
182            metric=metric,
183            algo_settings=algo_settings,
184            transformation=transformation,
185            code_settings=code_settings,
186            population_settings=population_settings,
187            init_const_settings=init_const_settings,
188            const_settings=const_settings,
189            target_clip=target_clip,
190            class_weight=class_weight,
191            cv_params=cv_params,
192            warm_start=warm_start,
193        )
INIT_CONST_SETTINGS = {'const_min': 0.0, 'const_max': 1.0, 'predefined_const_prob': 0.0, 'predefined_const_set': []}
CONST_SETTINGS = {'const_min': 0.0, 'const_max': 1.0, 'predefined_const_prob': 0.0, 'predefined_const_set': []}
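As a quick reference, here is a construction sketch that simply restates the defaults documented above as explicit keyword arguments; the dictionary values are the documented defaults, not tuned settings.

```python
# Construction sketch: the dictionaries repeat the documented defaults above,
# passed explicitly so the individual knobs are visible.
from HROCH import FuzzyRegressor

reg = FuzzyRegressor(
    num_threads=4,
    time_limit=5.0,
    problem={'f_and': 10.0, 'f_or': 10.0, 'f_xor': 1.0, 'f_not': 1.0, 'nop': 1.0},
    feature_probs='xicor',
    algo_settings={'neighbours_count': 15, 'alpha': 0.15, 'beta': 0.5,
                   'pretest_size': 1, 'sample_size': 16},
    code_settings={'min_size': 32, 'max_size': 32, 'const_size': 8},
    population_settings={'size': 64, 'tournament': 4},
    random_state=42,
)
```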
def fit(self, X, y, sample_weight=None, check_input=True):
195    def fit(self, X, y, sample_weight=None, check_input=True):
196        """
197        Fit the symbolic models according to the given training data.
198
199        Parameters
200        ----------
201        X : array-like of shape (n_samples, n_features)
202            Training vector, where `n_samples` is the number of samples and
203            `n_features` is the number of features. Should be in the range [0, 1].
204
205        y : array-like of shape (n_samples,)
206            Target vector relative to X. Needs samples of 2 classes.
207
208        sample_weight : array-like of shape (n_samples,) default=None
209            Array of weights that are assigned to individual samples.
210            If not provided, then each sample is given unit weight.
211
212        check_input : bool, default=True
213            Allow to bypass several input checking.
214            Don't use this parameter unless you know what you're doing.
215
216        Returns
217        -------
218        self
219            Fitted estimator.
220        """
221        if check_input:
222            X, y = validate_data(self, X, y, accept_sparse=False, y_numeric=False, multi_output=False)
223
224        y_type = type_of_target(y, input_name="y", raise_unknown=True)
225        if y_type != "binary":
226            raise ValueError("Only binary classification is supported. The type of the target " f"is {y_type}.")
227        check_classification_targets(y)
228        enc = LabelEncoder()
229        y_ind = enc.fit_transform(y)
230        self.classes_ = enc.classes_
231        self.n_classes_ = len(self.classes_)
232        if self.n_classes_ != 2:
233            raise ValueError("This solver needs samples of 2 classes" " in the data, but the data contains" " %r classes" % self.n_classes_)
234
235        self.class_weight_ = compute_class_weight(self.class_weight, classes=self.classes_, y=y)
236
237        super(FuzzyRegressor, self).fit(X, y_ind, sample_weight=sample_weight, check_input=check_input)
238        return self

Fit the symbolic models according to the given training data.

Parameters
  • X (array-like of shape (n_samples, n_features)): Training vector, where n_samples is the number of samples and n_features is the number of features. Should be in the range [0, 1].
  • y (array-like of shape (n_samples,)): Target vector relative to X. Needs samples of 2 classes.
  • sample_weight (array-like of shape (n_samples,) default=None): Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
  • check_input (bool, default=True): Allows bypassing several input checks. Don't use this parameter unless you know what you're doing.
Returns
  • self: Fitted estimator.
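A short end-to-end sketch of the requirements spelled out above: features in [0, 1] and a target with exactly two classes. The target rule below is synthetic and chosen only for illustration.

```python
# Fit/predict sketch: inputs lie in [0, 1] and y has exactly two classes,
# as required by fit above. The target here is a synthetic fuzzy-AND rule.
import numpy as np
from HROCH import FuzzyRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 3))                              # already in [0, 1]
y = ((X[:, 0] > 0.5) & (X[:, 1] > 0.5)).astype(int)   # binary target

reg = FuzzyRegressor(time_limit=5.0, random_state=42)
reg.fit(X, y)
labels = reg.predict(X)          # values taken from classes_
proba = reg.predict_proba(X)     # shape (n_samples, 2)
```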
def predict(self, X, id=None, check_input=True, use_parsed_model=True):
240    def predict(self, X, id=None, check_input=True, use_parsed_model=True):
241        """
242        Predict class for X.
243
244        Parameters
245        ----------
246        X : array-like of shape (n_samples, n_features)
247            The input samples.
248
249        id : int
250            Model id, default=None. The id can be obtained from the get_models method. If it is None, the best model is used for prediction.
251
252        check_input : bool, default=True
253            Allow to bypass several input checking.
254            Don't use this parameter unless you know what you're doing.
255
256        Returns
257        -------
258        y : ndarray of shape (n_samples,)
259            The predicted classes.
260        """
261        preds = super(FuzzyRegressor, self).predict(X, id, check_input=check_input, use_parsed_model=use_parsed_model)
262        return self.classes_[(preds > 0.5).astype(int)]

Predict class for X.

Parameters
  • X (array-like of shape (n_samples, n_features)): The input samples.
  • id (int): Model id, default=None. The id can be obtained from the get_models method. If it is None, the best model is used for prediction.
  • check_input (bool, default=True): Allows bypassing several input checks. Don't use this parameter unless you know what you're doing.
Returns
  • y (ndarray of shape (n_samples,)): The predicted classes.
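The listing above decodes labels by thresholding the raw model output at 0.5 and indexing into classes_; a standalone illustration of that step with made-up values:

```python
# Standalone illustration of the label decoding in predict above:
# raw outputs > 0.5 select the second entry of classes_.
import numpy as np

classes_ = np.array(['neg', 'pos'])
preds = np.array([0.20, 0.70, 0.51])
labels = classes_[(preds > 0.5).astype(int)]   # -> ['neg', 'pos', 'pos']
print(labels)
```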
def predict_proba(self, X, id=None, check_input=True):
264    def predict_proba(self, X, id=None, check_input=True):
265        """
266        Predict class probabilities for X.
267
268        Parameters
269        ----------
270        X : array-like of shape (n_samples, n_features)
271
272        check_input : bool, default=True
273            Allow to bypass several input checking.
274            Don't use this parameter unless you know what you're doing.
275
276        Returns
277        -------
278        T : ndarray of shape (n_samples, n_classes)
279            The class probabilities of the input samples. The order of the
280            classes corresponds to that in the attribute :term:`classes_`.
281        """
282        preds = super(FuzzyRegressor, self).predict(X, id, check_input=check_input)
283        proba = numpy.vstack([1 - preds, preds]).T
284        return proba

Predict class probabilities for X.

Parameters
  • X (array-like of shape (n_samples, n_features)): The input samples.
  • check_input (bool, default=True): Allows bypassing several input checks. Don't use this parameter unless you know what you're doing.
Returns
  • T (ndarray of shape (n_samples, n_classes)): The class probabilities of the input samples. The order of the classes corresponds to that in the attribute `classes_`.
class FuzzyClassifier(sklearn.multiclass.OneVsRestClassifier):
293class FuzzyClassifier(OneVsRestClassifier):
294    """
295    Fuzzy multiclass symbolic classifier
296
297    Parameters
298    ----------
299    estimator : FuzzyRegressor
300        Instance of FuzzyRegressor class.
301    """
302
303    def __init__(self, estimator: FuzzyRegressor):
304        super().__init__(estimator=estimator)
305
306    def fit(self, X, y):
307        """
308        Fit the symbolic models according to the given training data.
309
310        Parameters
311        ----------
312        X : array-like of shape (n_samples, n_features)
313            Training vector, where `n_samples` is the number of samples and
314            `n_features` is the number of features. Should be in the range [0, 1].
315
316        y : array-like of shape (n_samples,)
317            Target vector relative to X.
318
319        Returns
320        -------
321        self
322            Fitted estimator.
323        """
324
325        super().fit(X, y)
326        return self
327
328    def predict(self, X):
329        """
330        Predict class for X.
331
332        Parameters
333        ----------
334        X : array-like of shape (n_samples, n_features)
335            The input samples.
336
337        Returns
338        -------
339        y : ndarray of shape (n_samples,)
340            The predicted classes.
341        """
342        return super().predict(X)
343
344    def predict_proba(self, X):
345        """
346        Predict class probabilities for X.
347
348        Parameters
349        ----------
350        X : array-like of shape (n_samples, n_features)
351
352        Returns
353        -------
354        T : ndarray of shape (n_samples, n_classes)
355            The class probabilities of the input samples. The order of the
356            classes corresponds to that in the attribute :term:`classes_`.
357        """
358        return super().predict_proba(X)
359
360    def __sklearn_tags__(self):
361        tags = super().__sklearn_tags__()
362        tags.classifier_tags = ClassifierTags(poor_score=True)
363        return tags

Fuzzy multiclass symbolic classifier

Parameters
  • estimator (FuzzyRegressor): Instance of FuzzyRegressor class.
FuzzyClassifier(estimator: FuzzyRegressor)
303    def __init__(self, estimator: FuzzyRegressor):
304        super().__init__(estimator=estimator)
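A minimal multiclass sketch: the wrapper trains one FuzzyRegressor per class in one-vs-rest fashion. The three-class target below is synthetic, and the inputs stay in [0, 1] as required by the underlying estimator.

```python
# Multiclass sketch: one FuzzyRegressor is fitted per class (one-vs-rest).
import numpy as np
from HROCH import FuzzyRegressor, FuzzyClassifier

rng = np.random.default_rng(1)
X = rng.random((300, 4))
y = (X[:, 0] > 0.66).astype(int) + (X[:, 1] > 0.5).astype(int)   # classes 0, 1, 2

clf = FuzzyClassifier(estimator=FuzzyRegressor(time_limit=5.0, random_state=42))
clf.fit(X, y)
print(clf.predict(X[:5]))
print(clf.predict_proba(X[:5]).shape)   # (5, 3)
```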
def fit(self, X, y):
306    def fit(self, X, y):
307        """
308        Fit the symbolic models according to the given training data.
309
310        Parameters
311        ----------
312        X : array-like of shape (n_samples, n_features)
313            Training vector, where `n_samples` is the number of samples and
314            `n_features` is the number of features. Should be in the range [0, 1].
315
316        y : array-like of shape (n_samples,)
317            Target vector relative to X.
318
319        Returns
320        -------
321        self
322            Fitted estimator.
323        """
324
325        super().fit(X, y)
326        return self

Fit the symbolic models according to the given training data.

Parameters
  • X (array-like of shape (n_samples, n_features)): Training vector, where n_samples is the number of samples and n_features is the number of features. Should be in the range [0, 1].
  • y (array-like of shape (n_samples,)): Target vector relative to X.
Returns
  • self: Fitted estimator.
def predict(self, X):
328    def predict(self, X):
329        """
330        Predict class for X.
331
332        Parameters
333        ----------
334        X : array-like of shape (n_samples, n_features)
335            The input samples.
336
337        Returns
338        -------
339        y : ndarray of shape (n_samples,)
340            The predicted classes.
341        """
342        return super().predict(X)

Predict class for X.

Parameters
  • X (array-like of shape (n_samples, n_features)): The input samples.
Returns
  • y (ndarray of shape (n_samples,)): The predicted classes.
def predict_proba(self, X):
344    def predict_proba(self, X):
345        """
346        Predict class probabilities for X.
347
348        Parameters
349        ----------
350        X : array-like of shape (n_samples, n_features)
351
352        Returns
353        -------
354        T : ndarray of shape (n_samples, n_classes)
355            The class probabilities of the input samples. The order of the
356            classes corresponds to that in the attribute :term:`classes_`.
357        """
358        return super().predict_proba(X)

Predict class probabilities for X.

Parameters
  • X (array-like of shape (n_samples, n_features)): The input samples.
Returns
  • T (ndarray of shape (n_samples, n_classes)): The class probabilities of the input samples. The order of the classes corresponds to that in the attribute `classes_`.
class RegressorMathModel(sklearn.base.RegressorMixin, HROCH.hroch.MathModelBase):
207class RegressorMathModel(RegressorMixin, MathModelBase):
208    """
209    A regressor class for the symbolic model.
210    """
211
212    def __init__(self, m: ParsedMathModel, opt_metric, opt_params, transformation, target_clip) -> None:
213        super().__init__(m, opt_metric, opt_params, transformation, target_clip, None, None)
214
215    def fit(self, X, y, sample_weight=None, check_input=True):
216        """
217        Fit the model according to the given training data.
218
219        That means finding optimal values for the constants in a symbolic equation.
220
221        Parameters
222        ----------
223        X : array-like of shape (n_samples, n_features)
224            Training vector, where `n_samples` is the number of samples and
225            `n_features` is the number of features.
226
227        y : array-like of shape (n_samples,)
228            Target vector relative to X.
229
230        sample_weight : array-like of shape (n_samples,) default=None
231            Array of weights that are assigned to individual samples.
232            If not provided, then each sample is given unit weight.
233
234        check_input : bool, default=True
235            Allow to bypass several input checking.
236            Don't use this parameter unless you know what you're doing.
237
238        Returns
239        -------
240        self
241            Fitted estimator.
242        """
243
244        def objective(c):
245            return self.__eval(X, y, metric=self.opt_metric, c=c, sample_weight=sample_weight)
246
247        if len(self.m.coeffs) > 0:
248            result = opt.minimize(objective, self.m.coeffs, **self.opt_params)
249
250            for i in range(len(self.m.coeffs)):
251                self.m.coeffs[i] = result.x[i]
252
253        self.is_fitted_ = True
254        return self
255
256    def predict(self, X, check_input=True):
257        """
258        Predict regression target for X.
259
260        Parameters
261        ----------
262        X : array-like of shape (n_samples, n_features)
263            The input samples.
264
265        check_input : bool, default=True
266            Allow to bypass several input checking.
267            Don't use this parameter unless you know what you're doing.
268
269        Returns
270        -------
271        y : ndarray of shape (n_samples,) or (n_samples, n_outputs)
272            The predicted values.
273        """
274        return self._predict(X)
275
276    def __eval(self, X, y, metric, c=None, sample_weight=None):
277        if c is not None:
278            self.m.coeffs = c
279        try:
280            return -metric(self, X, y, sample_weight=sample_weight)
281        except Exception:
282            return SymbolicSolver.LARGE_FLOAT
283
284    def __str__(self):
285        return f"RegressorMathModel({self.m.str_representation})"
286
287    def __repr__(self):
288        return f"RegressorMathModel({self.m.str_representation})"
289
290    def __sklearn_tags__(self):
291        return super().__sklearn_tags__()

A regressor class for the symbolic model.

RegressorMathModel( m: HROCH.hroch.ParsedMathModel, opt_metric, opt_params, transformation, target_clip)
212    def __init__(self, m: ParsedMathModel, opt_metric, opt_params, transformation, target_clip) -> None:
213        super().__init__(m, opt_metric, opt_params, transformation, target_clip, None, None)
def fit(self, X, y, sample_weight=None, check_input=True):
215    def fit(self, X, y, sample_weight=None, check_input=True):
216        """
217        Fit the model according to the given training data.
218
219        That means finding optimal values for the constants in a symbolic equation.
220
221        Parameters
222        ----------
223        X : array-like of shape (n_samples, n_features)
224            Training vector, where `n_samples` is the number of samples and
225            `n_features` is the number of features.
226
227        y : array-like of shape (n_samples,)
228            Target vector relative to X.
229
230        sample_weight : array-like of shape (n_samples,) default=None
231            Array of weights that are assigned to individual samples.
232            If not provided, then each sample is given unit weight.
233
234        check_input : bool, default=True
235            Allow to bypass several input checking.
236            Don't use this parameter unless you know what you're doing.
237
238        Returns
239        -------
240        self
241            Fitted estimator.
242        """
243
244        def objective(c):
245            return self.__eval(X, y, metric=self.opt_metric, c=c, sample_weight=sample_weight)
246
247        if len(self.m.coeffs) > 0:
248            result = opt.minimize(objective, self.m.coeffs, **self.opt_params)
249
250            for i in range(len(self.m.coeffs)):
251                self.m.coeffs[i] = result.x[i]
252
253        self.is_fitted_ = True
254        return self

Fit the model according to the given training data.

That means finding optimal values for the constants in a symbolic equation.

Parameters
  • X (array-like of shape (n_samples, n_features)): Training vector, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like of shape (n_samples,)): Target vector relative to X.
  • sample_weight (array-like of shape (n_samples,) default=None): Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
  • check_input (bool, default=True): Allows bypassing several input checks. Don't use this parameter unless you know what you're doing.
Returns
  • self: Fitted estimator.
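fit above tunes the equation's constants by minimizing an objective built from the configured metric with scipy.optimize.minimize. A standalone sketch of that pattern on a toy expression (the model, data, and penalty value are made up for illustration and are not the library's internals):

```python
# Standalone sketch of the constant-tuning loop used by fit above: a fixed
# symbolic expression, an objective over its constants, and a Nelder-Mead search.
import numpy as np
import scipy.optimize as opt

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 2.5 * X[:, 0] ** 2 + 1.3 + 0.1 * rng.normal(size=200)

def expression(X, c):
    # hypothetical parsed model with two tunable constants c[0], c[1]
    return c[0] * X[:, 0] ** 2 + c[1]

def objective(c):
    try:
        return float(np.mean((y - expression(X, c)) ** 2))
    except Exception:
        return 1e30   # large penalty, analogous to SymbolicSolver.LARGE_FLOAT

result = opt.minimize(objective, x0=[1.0, 1.0], method='Nelder-Mead')
print(result.x)       # constants close to [2.5, 1.3]
```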
def predict(self, X, check_input=True):
256    def predict(self, X, check_input=True):
257        """
258        Predict regression target for X.
259
260        Parameters
261        ----------
262        X : array-like of shape (n_samples, n_features)
263            The input samples.
264
265        check_input : bool, default=True
266            Allow to bypass several input checking.
267            Don't use this parameter unless you know what you're doing.
268
269        Returns
270        -------
271        y : ndarray of shape (n_samples,) or (n_samples, n_outputs)
272            The predicted values.
273        """
274        return self._predict(X)

Predict regression target for X.

Parameters
  • X (array-like of shape (n_samples, n_features)): The input samples.
  • check_input (bool, default=True): Allows bypassing several input checks. Don't use this parameter unless you know what you're doing.
Returns
  • y (ndarray of shape (n_samples,) or (n_samples, n_outputs)): The predicted values.
class ClassifierMathModel(sklearn.base.ClassifierMixin, HROCH.hroch.MathModelBase):
294class ClassifierMathModel(ClassifierMixin, MathModelBase):
295    """
296    A classifier class for the symbolic model.
297    """
298
299    def __init__(
300        self,
301        m: ParsedMathModel,
302        opt_metric,
303        opt_params,
304        transformation,
305        target_clip,
306        class_weight_,
307        classes_,
308    ) -> None:
309        super().__init__(
310            m,
311            opt_metric,
312            opt_params,
313            transformation,
314            target_clip,
315            class_weight_,
316            classes_,
317        )
318
319    def fit(self, X, y, sample_weight=None, check_input=True):
320        """
321        Fit the model according to the given training data.
322
323        That means finding optimal values for the constants in a symbolic equation.
324
325        Parameters
326        ----------
327        X : array-like of shape (n_samples, n_features)
328            Training vector, where `n_samples` is the number of samples and
329            `n_features` is the number of features.
330
331        y : array-like of shape (n_samples,)
332            Target vector relative to X. Needs samples of 2 classes.
333
334        sample_weight : array-like of shape (n_samples,) default=None
335            Array of weights that are assigned to individual samples.
336            If not provided, then each sample is given unit weight.
337
338        check_input : bool, default=True
339            Allow to bypass several input checking.
340            Don't use this parameter unless you know what you're doing.
341
342        Returns
343        -------
344        self
345            Fitted estimator.
346        """
347
348        check_classification_targets(y)
349        enc = LabelEncoder()
350        y_ind = enc.fit_transform(y)
351        self.classes_ = enc.classes_
352        self.n_classes_ = len(self.classes_)
353        if self.n_classes_ != 2:
354            raise ValueError("This solver needs samples of 2 classes" " in the data, but the data contains" " %r classes" % self.n_classes_)
355
356        cw = self.class_weight_
357        cw_sample_weight = numpy.array(cw)[y_ind] if len(cw) == 2 and cw[0] != cw[1] else None
358        if sample_weight is None:
359            sample_weight = cw_sample_weight
360        elif cw_sample_weight is not None:
361            sample_weight = sample_weight * cw_sample_weight
362
363        def objective(c):
364            return self.__eval(X, y, metric=self.opt_metric, c=c, sample_weight=sample_weight)
365
366        if len(self.m.coeffs) > 0:
367            result = opt.minimize(objective, self.m.coeffs, **self.opt_params)
368
369            for i in range(len(self.m.coeffs)):
370                self.m.coeffs[i] = result.x[i]
371
372        self.is_fitted_ = True
373        return self
374
375    def predict(self, X, check_input=True):
376        """
377        Predict class for X.
378
379        Parameters
380        ----------
381        X : array-like of shape (n_samples, n_features)
382            The input samples.
383
384        check_input : bool, default=True
385            Allow to bypass several input checking.
386            Don't use this parameter unless you know what you're doing.
387
388        Returns
389        -------
390        y : ndarray of shape (n_samples,)
391            The predicted classes.
392        """
393        preds = self._predict(X, check_input=check_input)
394        return self.classes_[(preds > 0.5).astype(int)]
395
396    def predict_proba(self, X, check_input=True):
397        """
398        Predict class probabilities for X.
399
400        Parameters
401        ----------
402        X : array-like of shape (n_samples, n_features)
403
404        check_input : bool, default=True
405            Allow to bypass several input checking.
406            Don't use this parameter unless you know what you're doing.
407
408        Returns
409        -------
410        T : ndarray of shape (n_samples, n_classes)
411            The class probabilities of the input samples. The order of the
412            classes corresponds to that in the attribute :term:`classes_`.
413        """
414        preds = self._predict(X, check_input=check_input)
415        proba = numpy.vstack([1 - preds, preds]).T
416        return proba
417
418    def __eval(self, X, y, metric, c=None, sample_weight=None):
419        if c is not None:
420            self.m.coeffs = c
421        try:
422            return -metric(self, X, y, sample_weight=sample_weight)
423        except Exception:
424            return SymbolicSolver.LARGE_FLOAT
425
426    def __str__(self):
427        return f"ClassifierMathModel({self.m.str_representation})"
428
429    def __repr__(self):
430        return f"ClassifierMathModel({self.m.str_representation})"
431
432    def __sklearn_tags__(self):
433        return super().__sklearn_tags__()

A classifier class for the symbolic model.

ClassifierMathModel( m: HROCH.hroch.ParsedMathModel, opt_metric, opt_params, transformation, target_clip, class_weight_, classes_)
299    def __init__(
300        self,
301        m: ParsedMathModel,
302        opt_metric,
303        opt_params,
304        transformation,
305        target_clip,
306        class_weight_,
307        classes_,
308    ) -> None:
309        super().__init__(
310            m,
311            opt_metric,
312            opt_params,
313            transformation,
314            target_clip,
315            class_weight_,
316            classes_,
317        )
def fit(self, X, y, sample_weight=None, check_input=True):
319    def fit(self, X, y, sample_weight=None, check_input=True):
320        """
321        Fit the model according to the given training data.
322
323        That means finding optimal values for the constants in a symbolic equation.
324
325        Parameters
326        ----------
327        X : array-like of shape (n_samples, n_features)
328            Training vector, where `n_samples` is the number of samples and
329            `n_features` is the number of features.
330
331        y : array-like of shape (n_samples,)
332            Target vector relative to X. Needs samples of 2 classes.
333
334        sample_weight : array-like of shape (n_samples,) default=None
335            Array of weights that are assigned to individual samples.
336            If not provided, then each sample is given unit weight.
337
338        check_input : bool, default=True
339            Allow to bypass several input checking.
340            Don't use this parameter unless you know what you're doing.
341
342        Returns
343        -------
344        self
345            Fitted estimator.
346        """
347
348        check_classification_targets(y)
349        enc = LabelEncoder()
350        y_ind = enc.fit_transform(y)
351        self.classes_ = enc.classes_
352        self.n_classes_ = len(self.classes_)
353        if self.n_classes_ != 2:
354            raise ValueError("This solver needs samples of 2 classes" " in the data, but the data contains" " %r classes" % self.n_classes_)
355
356        cw = self.class_weight_
357        cw_sample_weight = numpy.array(cw)[y_ind] if len(cw) == 2 and cw[0] != cw[1] else None
358        if sample_weight is None:
359            sample_weight = cw_sample_weight
360        elif cw_sample_weight is not None:
361            sample_weight = sample_weight * cw_sample_weight
362
363        def objective(c):
364            return self.__eval(X, y, metric=self.opt_metric, c=c, sample_weight=sample_weight)
365
366        if len(self.m.coeffs) > 0:
367            result = opt.minimize(objective, self.m.coeffs, **self.opt_params)
368
369            for i in range(len(self.m.coeffs)):
370                self.m.coeffs[i] = result.x[i]
371
372        self.is_fitted_ = True
373        return self

Fit the model according to the given training data.

That means finding optimal values for the constants in a symbolic equation.

Parameters
  • X (array-like of shape (n_samples, n_features)): Training vector, where n_samples is the number of samples and n_features is the number of features.
  • y (array-like of shape (n_samples,)): Target vector relative to X. Needs samples of 2 classes.
  • sample_weight (array-like of shape (n_samples,) default=None): Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
  • check_input (bool, default=True): Allows bypassing several input checks. Don't use this parameter unless you know what you're doing.
Returns
  • self: Fitted estimator.
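fit above turns the two per-class weights into per-sample weights by indexing with the encoded labels, and multiplies them into any user-supplied sample_weight. A tiny illustration with made-up numbers:

```python
# Illustration of the class-weight handling in fit above: encoded labels index
# into the two class weights, and the result multiplies any given sample_weight.
import numpy as np

class_weight_ = np.array([0.75, 1.50])        # weights for class 0 and class 1
y_ind = np.array([0, 1, 1, 0, 1])             # LabelEncoder output
cw_sample_weight = class_weight_[y_ind]       # [0.75, 1.5, 1.5, 0.75, 1.5]

sample_weight = np.array([1.0, 1.0, 2.0, 1.0, 1.0])
combined = sample_weight * cw_sample_weight   # used when both are provided
print(combined)                               # [0.75 1.5  3.   0.75 1.5 ]
```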
def predict(self, X, check_input=True):
375    def predict(self, X, check_input=True):
376        """
377        Predict class for X.
378
379        Parameters
380        ----------
381        X : array-like of shape (n_samples, n_features)
382            The input samples.
383
384        check_input : bool, default=True
385            Allow to bypass several input checking.
386            Don't use this parameter unless you know what you're doing.
387
388        Returns
389        -------
390        y : ndarray of shape (n_samples,)
391            The predicted classes.
392        """
393        preds = self._predict(X, check_input=check_input)
394        return self.classes_[(preds > 0.5).astype(int)]

Predict class for X.

Parameters
  • X (array-like of shape (n_samples, n_features)): The input samples.
  • check_input (bool, default=True): Allows bypassing several input checks. Don't use this parameter unless you know what you're doing.
Returns
  • y (ndarray of shape (n_samples,)): The predicted classes.
def predict_proba(self, X, check_input=True):
396    def predict_proba(self, X, check_input=True):
397        """
398        Predict class probabilities for X.
399
400        Parameters
401        ----------
402        X : array-like of shape (n_samples, n_features)
403
404        check_input : bool, default=True
405            Allow to bypass several input checks.
406            Don't use this parameter unless you know what you're doing.
407
408        Returns
409        -------
410        T : ndarray of shape (n_samples, n_classes)
411            The class probabilities of the input samples. The order of the
412            classes corresponds to that in the attribute :term:`classes_`.
413        """
414        preds = self._predict(X, check_input=check_input)
415        proba = numpy.vstack([1 - preds, preds]).T
416        return proba

Predict class probabilities for X.

Parameters
  • X (array-like of shape (n_samples, n_features)): The input samples.
  • check_input (bool, default=True): Allow to bypass several input checks. Don't use this parameter unless you know what you're doing.
Returns
  • T (ndarray of shape (n_samples, n_classes)): The class probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
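The probability matrix is built directly from the raw output, as the source above shows: column 0 is 1 - preds and column 1 is preds. A standalone numpy illustration:

import numpy

preds = numpy.array([0.1, 0.7, 0.51])       # raw model outputs, treated as P(second class)
proba = numpy.vstack([1 - preds, preds]).T  # shape (n_samples, 2)
print(proba)
# [[0.9  0.1 ]
#  [0.3  0.7 ]
#  [0.49 0.51]]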
def set_fit_request(unknown):

Request metadata passed to the fit method (generated by scikit-learn's metadata-routing machinery, new in scikit-learn 1.3).

This setter controls whether metadata accepted by fit, such as sample_weight and check_input, is requested when the estimator is used inside a meta-estimator (e.g. a Pipeline) with metadata routing enabled; otherwise it has no effect. See the sketch after this entry.
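A hedged sketch of how such a request setter is typically used, assuming scikit-learn >= 1.3 with metadata routing enabled (the training data variables are placeholders):

import sklearn
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from HROCH import NonlinearLogisticRegressor

# Request flags only matter when metadata routing is switched on globally.
sklearn.set_config(enable_metadata_routing=True)

clf = NonlinearLogisticRegressor().set_fit_request(sample_weight=True)
pipe = make_pipeline(StandardScaler(), clf)

# sample_weight is now routed through the pipeline down to clf.fit().
pipe.fit(X_train, y_train, sample_weight=w_train)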
def set_predict_request(unknown):

Request metadata passed to the predict method (generated by scikit-learn's metadata-routing machinery, new in scikit-learn 1.3).

This setter controls whether metadata accepted by predict, such as check_input, is requested when the estimator is used inside a meta-estimator with metadata routing enabled; otherwise it has no effect.
def set_predict_proba_request(unknown):

Request metadata passed to the predict_proba method (generated by scikit-learn's metadata-routing machinery, new in scikit-learn 1.3).

This setter controls whether metadata accepted by predict_proba, such as check_input, is requested when the estimator is used inside a meta-estimator with metadata routing enabled; otherwise it has no effect.
def set_score_request(unknown):

Request metadata passed to the score method (generated by scikit-learn's metadata-routing machinery, new in scikit-learn 1.3).

This setter controls whether metadata accepted by score, such as sample_weight, is requested when the estimator is used inside a meta-estimator with metadata routing enabled; otherwise it has no effect.
def Xicor(X: numpy.ndarray, Y: numpy.ndarray):
521def Xicor(X: numpy.ndarray, Y: numpy.ndarray):
522    """
523    Xicor correlation coefficient.
524
525    This function computes the xi coefficient between two vectors x and y.
526    https://doi.org/10.1080/01621459.2020.1758115
527
528    Parameters
529    ----------
530    X : array-like input vector x
531
532    Y : array-like input vector y
533
534    Returns
535    -------
536    xi : float
537    """
538    if X.ndim != 1:
539        X = numpy.ravel(X)
540    if Y.ndim != 1:
541        Y = numpy.ravel(Y)
542    if len(X) != len(Y):
543        raise ValueError("X and Y must be same size")
544    precision = numpy.float32
545    if X.dtype == Y.dtype and X.dtype == numpy.float64:
546        precision = numpy.float64
547    if not X.flags["C_CONTIGUOUS"] or X.dtype != precision:
548        X = numpy.ascontiguousarray(X.astype(precision))
549    if not Y.flags["C_CONTIGUOUS"] or Y.dtype != precision:
550        Y = numpy.ascontiguousarray(Y.astype(precision))
551    if precision == numpy.float32:
552        return Xicor32(X, Y, len(X))
553    return Xicor64(X, Y, len(X))

Xicor correlation coefficient.

This function computes the xi correlation coefficient between two vectors x and y (https://doi.org/10.1080/01621459.2020.1758115). Unlike Pearson correlation, xi can detect general, possibly nonlinear, functional dependence; note that it is not symmetric, since Xicor(X, Y) measures how well Y can be expressed as a function of X.

Parameters
  • X (array-like): Input vector x.
  • Y (array-like): Input vector y.
Returns
  • xi (float): The xi correlation coefficient.
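A minimal usage sketch (Xicor is exported from the HROCH package, as the module listing above shows; the data here is synthetic):

import numpy
from HROCH import Xicor

rng = numpy.random.default_rng(0)
x = rng.normal(size=1000)
y = x ** 2 + 0.1 * rng.normal(size=1000)  # nonlinear dependence, near-zero Pearson correlation

xi = Xicor(x, y)  # high: y is (almost) a function of x
print(xi)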
__version__ = '1.4.12'