RawKNNRegressor

sknnr.RawKNNRegressor

Bases: DFIndexCrosswalkMixin, IndependentPredictorMixin, KNeighborsRegressor

Subclass of sklearn.neighbors.KNeighborsRegressor that adds independent prediction and scoring and crosswalks array indices to dataframe indexes.

See sklearn.neighbors.KNeighborsRegressor for more information on the k-neighbors regression parameters available at instantiation.

Parameters:

Name	Type	Description	Default
`n_neighbors`	`int`	Number of neighbors to use by default for `kneighbors` queries.	`5`
`weights`	`('uniform', 'distance')`	Weight function used in prediction.	`'uniform'`
`algorithm`	`('auto', 'ball_tree', 'kd_tree', 'brute')`	Algorithm used to compute the nearest neighbors.	`'auto'`
`leaf_size`	`int`	Leaf size passed to `BallTree` or `KDTree`.	`30`
`p`	`int`	Power parameter for the Minkowski metric.	`2`
`metric`	`str or callable`	The distance metric to use for the tree, calculated in standardized Euclidean space.	`'minkowski'`
`metric_params`	`dict`	Additional keyword arguments for the metric function.	`None`
`n_jobs`	`int`	The number of parallel jobs to run for neighbors search. `None` means 1 unless in a `joblib.parallel_backend` context. `-1` means using all processors.	`None`

Attributes:

Name	Type	Description
`DISTANCE_PRECISION_DECIMALS`	`int, class attribute`	Number of decimal places used when rounding scaled distances to ensure deterministic neighbor ordering. Default is 10.
`effective_metric_`	`str`	The distance metric to use. It will be the same as the `metric` parameter or a synonym of it, e.g. 'euclidean' if the `metric` parameter is set to 'minkowski' and the `p` parameter is set to 2.
`effective_metric_params_`	`dict`	Additional keyword arguments for the metric function. For most metrics this will be the same as the `metric_params` parameter, but it may also contain the `p` parameter value if the `effective_metric_` attribute is set to 'minkowski'.
`independent_prediction_`	`array-like of shape (n_samples, n_outputs)`	The independent predictions for each sample in the training set, obtained by calculating `kneighbors` on the training data itself and calculating predictions based on those neighbors.
`independent_score_`	`float`	The independent score (i.e. coefficient of determination or R²) for the model, obtained by calculating the average R² across all outputs.
`n_features_in_`	`int`	Number of features seen during `fit`.
`n_samples_fit_`	`int`	Number of samples in the fitted data.
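
To make `independent_prediction_` and `independent_score_` concrete, here is a minimal pure-Python sketch; it is not sknnr's implementation. It assumes uniform weights, Euclidean distance, and that each training sample is excluded from its own neighbor set (as happens when scikit-learn's `kneighbors` is called without `X`). The helper names `independent_predictions` and `mean_r2` are hypothetical.

```python
import math


def independent_predictions(X, y, k):
    """Predict each training sample from its k nearest *other* samples."""
    n_outputs = len(y[0])
    preds = []
    for i, xi in enumerate(X):
        # Distances to every other training sample (self excluded).
        others = [(math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i]
        others.sort()  # ties fall back to the smaller sample index
        neighbors = [j for _, j in others[:k]]
        # Uniform weights: average each output over the neighbors.
        preds.append(
            [sum(y[j][o] for j in neighbors) / k for o in range(n_outputs)]
        )
    return preds


def mean_r2(y_true, y_pred):
    """Average the coefficient of determination across all outputs."""
    n_outputs = len(y_true[0])
    scores = []
    for o in range(n_outputs):
        t = [row[o] for row in y_true]
        p = [row[o] for row in y_pred]
        mean_t = sum(t) / len(t)
        ss_res = sum((a - b) ** 2 for a, b in zip(t, p))
        ss_tot = sum((a - mean_t) ** 2 for a in t)
        scores.append(1.0 - ss_res / ss_tot)
    return sum(scores) / n_outputs


# Toy training set: one feature, one output.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [[0.0], [1.0], [2.0], [3.0]]
independent_pred = independent_predictions(X, y, k=1)
independent_score = mean_r2(y, independent_pred)
```

Because no sample may use itself as a neighbor, the resulting score is typically lower than a score computed by predicting the training data directly.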

Attributes

DISTANCE_PRECISION_DECIMALS `class-attribute` `instance-attribute`

DISTANCE_PRECISION_DECIMALS = 10
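
The role of this constant can be illustrated with a small sketch. It assumes that distances are rounded to `DISTANCE_PRECISION_DECIMALS` places and that ties are then broken by ascending sample index. The tie-breaking rule and the helper name `deterministic_order` are assumptions for illustration, not confirmed details of sknnr.

```python
DISTANCE_PRECISION_DECIMALS = 10


def deterministic_order(distances, decimals=DISTANCE_PRECISION_DECIMALS):
    """Return sample indices sorted by rounded distance, then by index."""
    rounded = [round(d, decimals) for d in distances]
    return sorted(range(len(rounded)), key=lambda i: (rounded[i], i))


# The first two distances differ only by floating-point noise.
distances = [0.1 + 0.2, 0.3, 0.25]

# Sorting on the raw values lets the noise decide the order.
raw_order = sorted(range(len(distances)), key=lambda i: distances[i])

# Rounding first makes the two near-equal distances an exact tie,
# which the index then resolves the same way on every run.
stable_order = deterministic_order(distances)
```

Without rounding, which of two effectively equidistant neighbors comes first depends on floating-point noise; with rounding, the order is fixed by an explicit rule.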

Functions

fit

fit(X: DataLike, y: DataLike) -> Self

Override fit to set attributes using mixins.

Source code in src/sknnr/_base.py

def fit(self, X: DataLike, y: DataLike) -> Self:
    """Override fit to set attributes using mixins."""
    self._set_dataframe_index_in(X)
    self = super().fit(X, y)
    self._set_independent_prediction_attributes(y)
    return self
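
The order of operations in this override can be sketched with stand-in classes. Everything below is hypothetical (`IndexMixin`, `ParentEstimator`, `SketchRegressor`, and `Frame` are not part of sknnr or scikit-learn); the sketch only shows the pattern of capturing the input's index, delegating to the parent `fit`, and then setting attributes that need a fitted model.

```python
class IndexMixin:
    """Hypothetical stand-in for a mixin that remembers row labels."""

    def _set_index_in(self, X):
        # Store labels only when the input exposes an `index` attribute.
        index = getattr(X, "index", None)
        self.index_in_ = list(index) if index is not None else None


class ParentEstimator:
    """Hypothetical stand-in for the parent estimator."""

    def fit(self, X, y):
        self.n_samples_fit_ = len(y)
        return self


class SketchRegressor(IndexMixin, ParentEstimator):
    def fit(self, X, y):
        # 1. Capture labels before delegating to the parent.
        self._set_index_in(X)
        # 2. Let the parent do the actual fitting.
        super().fit(X, y)
        # 3. Set attributes that require an already fitted model.
        self.fitted_y_ = list(y)
        return self


class Frame:
    """Tiny dataframe-like object with rows and an `index`."""

    def __init__(self, rows, index):
        self.rows = rows
        self.index = index


model = SketchRegressor().fit(Frame([[0.0], [1.0]], ["a", "b"]), [10.0, 20.0])
```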

kneighbors

kneighbors(X: DataLike | None = None, n_neighbors: int | None = None, return_distance: bool = True, return_dataframe_index: bool = False, use_deterministic_ordering: bool = True) -> NDArray[int64] | tuple[NDArray[float64], NDArray[int64]]

Find the k nearest neighbors of a point or points in the dataset. If the model was fitted with a dataframe, dataframe indexes can optionally be returned instead of array indices.
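
The crosswalk from array indices to dataframe indexes amounts to a positional lookup into the labels seen at fit time. The sketch below shows that idea in plain Python; the helper `crosswalk_indices` and the sample labels are hypothetical and do not reflect sknnr's internals.

```python
def crosswalk_indices(neighbor_idx, index_labels):
    """Map positional neighbor indices to the labels of the fitted rows."""
    return [[index_labels[i] for i in row] for row in neighbor_idx]


# Labels of the rows the model was (hypothetically) fitted on.
index_labels = ["plot_101", "plot_205", "plot_317"]

# Positional indices as a two-neighbor query might return them,
# one row per query point.
neighbor_idx = [[2, 0], [1, 2]]

neighbor_labels = crosswalk_indices(neighbor_idx, index_labels)
```

Returning labels rather than positions lets neighbors be joined back to the original dataframe without tracking row order separately.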