mgwr.sel_bw.Sel_BW

class mgwr.sel_bw.Sel_BW(coords, y, X_loc, X_glob=None, family=<spglm.family.Gaussian object>, offset=None, kernel='bisquare', fixed=False, multi=False, constant=True, spherical=False)[source]

Select bandwidth for kernel

Methods: pp. 211-213, bandwidth selection, in Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2002). Geographically weighted regression: the analysis of spatially varying relationships.

Parameters:
y : array

n*1, dependent variable.

X_glob : array

n*k1, fixed independent variable.

X_loc : array

n*k2, local independent variable, including constant.

coords : list of tuples

(x,y) of points used in bandwidth selection

family : family object

GWR model type: an spglm.family instance, Gaussian() (default), Binomial() for logistic models, or Poisson().

offset : array

n*1, the offset variable at the ith location. For a Poisson model this term is often the size of the population at risk or, in spatial epidemiology, the expected size of the outcome. Default is None, in which case Ni becomes 1.0 for all locations.

kernel : string

kernel function: 'gaussian', 'bisquare', 'exponential'

fixed : boolean

True for fixed bandwidth and False for adaptive (NN)

multi : boolean

True for multiple (covariate-specific) bandwidths, False for a traditional (same for all covariates) bandwidth; default is False.

constant : boolean

True to include intercept (default) in model and False to exclude intercept.

spherical : boolean

True for spherical coordinates (long-lat), False for projected coordinates (default).
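The interaction between kernel and fixed can be illustrated with a small standalone sketch. It does not call mgwr; the helper names kernel_weight and local_weights are hypothetical, and the kernel formulas are the common textbook forms, which may differ in scaling from the library's own implementation. With fixed=True the bandwidth is read as a distance; with fixed=False it is read as a neighbor count, and the distance to that nearest neighbor becomes the local bandwidth.

```python
import math


def kernel_weight(d, h, kernel="bisquare"):
    """Weight of an observation at distance d for a bandwidth distance h."""
    z = d / h
    if kernel == "gaussian":
        return math.exp(-0.5 * z * z)
    if kernel == "exponential":
        return math.exp(-z)
    if kernel == "bisquare":
        # Bisquare is truncated: points at or beyond the bandwidth get zero.
        return (1.0 - z * z) ** 2 if z < 1.0 else 0.0
    raise ValueError(f"unknown kernel: {kernel}")


def local_weights(dists, bw, kernel="bisquare", fixed=False):
    """Weights for one calibration point (hypothetical helper).

    fixed=True  -> bw is a distance shared by every location.
    fixed=False -> bw is a number of nearest neighbors (NN); the distance
                   to the bw-th nearest point is used as the bandwidth.
    """
    if fixed:
        h = float(bw)
    else:
        h = sorted(dists)[int(bw) - 1]
    return [kernel_weight(d, h, kernel) for d in dists]


# Distances from one calibration point to all points (itself included).
dists = [0.0, 1.0, 2.0, 3.0, 4.0]
fixed_w = local_weights(dists, bw=2.0, fixed=True)    # bandwidth = 2.0 units
adaptive_w = local_weights(dists, bw=4, fixed=False)  # bandwidth = 4th nearest
print(fixed_w)
print(adaptive_w)
```

In this toy case the adaptive bandwidth resolves to a distance of 3.0, so more neighbors receive a non-zero weight than under the fixed bandwidth of 2.0.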

Examples

>>> import numpy as np
>>> import libpysal as ps
>>> from mgwr.sel_bw import Sel_BW
>>> data = ps.io.open(ps.examples.get_path('GData_utm.csv'))
>>> coords = list(zip(data.by_col('X'), data.by_col('Y')))
>>> y = np.array(data.by_col('PctBach')).reshape((-1,1))
>>> rural = np.array(data.by_col('PctRural')).reshape((-1,1))
>>> pov = np.array(data.by_col('PctPov')).reshape((-1,1))
>>> african_amer = np.array(data.by_col('PctBlack')).reshape((-1,1))
>>> X = np.hstack([rural, pov, african_amer])

Golden section search AICc - adaptive bisquare

>>> bw = Sel_BW(coords, y, X).search(criterion='AICc')
>>> print(bw)
93.0

Golden section search AIC - adaptive Gaussian

>>> bw = Sel_BW(coords, y, X, kernel='gaussian').search(criterion='AIC')
>>> print(bw)
50.0

Golden section search BIC - adaptive Gaussian

>>> bw = Sel_BW(coords, y, X, kernel='gaussian').search(criterion='BIC')
>>> print(bw)
62.0

Golden section search CV - adaptive Gaussian

>>> bw = Sel_BW(coords, y, X, kernel='gaussian').search(criterion='CV')
>>> print(bw)
68.0
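For readers unfamiliar with the search itself, the following is a minimal sketch of a golden section search over a bandwidth range. It is not mgwr's implementation: the function name golden_section, its arguments, and the toy score function are all assumptions made for illustration, and the sketch works on a continuous bandwidth only. It assumes the criterion has a single minimum inside the bracket.

```python
import math


def golden_section(score, lo, hi, tol=1e-6, max_iter=200):
    """Minimize score(bw) on [lo, hi], assuming a single minimum.

    For clarity this version re-evaluates both interior points on every
    pass; an optimized version would reuse one of the two evaluations.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        b = hi - ratio * (hi - lo)  # left interior point
        d = lo + ratio * (hi - lo)  # right interior point
        if score(b) < score(d):
            hi = d  # the minimum lies in [lo, d]
        else:
            lo = b  # the minimum lies in [b, hi]
    return (lo + hi) / 2.0


# Hypothetical criterion with its minimum at a bandwidth of 93.
def toy_score(bw):
    return (bw - 93.0) ** 2


best_bw = golden_section(toy_score, lo=30.0, hi=160.0)
print(round(best_bw, 3))
```

In a real search the score would be the chosen criterion (AICc, AIC, BIC, or CV) of a model fitted at each candidate bandwidth, which is why limiting the number of evaluations matters.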

Interval AICc - fixed bisquare

>>> sel = Sel_BW(coords, y, X, fixed=True)
>>> bw = sel.search(search_method='interval', bw_min=211001.0, bw_max=211035.0, interval=2)
>>> print(bw)
211025.0
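An interval search simply evaluates the criterion on an evenly spaced grid between bw_min and bw_max and keeps the best candidate. The sketch below shows the idea with a hypothetical interval_search helper and a made-up score function; it does not call mgwr, and including bw_max as the last candidate is an assumption of this sketch.

```python
def interval_search(score, bw_min, bw_max, interval):
    """Return the grid bandwidth with the lowest score (hypothetical helper)."""
    best_bw, best_score = None, float("inf")
    n_steps = int((bw_max - bw_min) // interval)
    for i in range(n_steps + 1):
        bw = bw_min + i * interval  # bw_min, bw_min + interval, ...
        s = score(bw)
        if s < best_score:
            best_bw, best_score = bw, s
    return best_bw


# Made-up criterion whose minimum falls between two grid points.
def toy_score(bw):
    return (bw - 211025.4) ** 2


grid_bw = interval_search(toy_score, bw_min=211001.0, bw_max=211035.0, interval=2)
print(grid_bw)
```

The result can only be as precise as the grid spacing, so a coarse interval trades accuracy for fewer model fits.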
Attributes:
y : array

n*1, dependent variable.

X_glob : array

n*k1, fixed independent variable.

X_loc : array

n*k2, local independent variable, including constant.

coords : list of tuples

(x,y) of points used in bandwidth selection

family : family object

GWR model type: an spglm.family instance, Gaussian() (default), Binomial() for logistic models, or Poisson().

kernel : string

type of kernel used and whether fixed or adaptive

fixed : boolean

True for fixed bandwidth and False for adaptive (NN)

criterion : string

bw selection criterion: ‘AICc’, ‘AIC’, ‘BIC’, ‘CV’

search_method : string

bw search method: ‘golden’, ‘interval’

bw_min : float

min value used in bandwidth search

bw_max : float

max value used in bandwidth search

interval : float

interval increment used in interval search

tol : float

tolerance used to determine convergence

max_iter : integer

max iterations if no convergence to tol

multi : boolean

True for multiple (covariate-specific) bandwidths, False for a traditional (same for all covariates) bandwidth; default is False.

constant : boolean

True to include intercept (default) in model and False to exclude intercept.

offset : array

n*1, the offset variable at the ith location. For a Poisson model this term is often the size of the population at risk or, in spatial epidemiology, the expected size of the outcome. Default is None, in which case Ni becomes 1.0 for all locations.

dmat : array

n*n, distance matrix between calibration locations used to compute weight matrix

sorted_dmat : array

n*n, sorted distance matrix between calibration locations used to compute weight matrix. Will be None for fixed bandwidths

spherical : boolean

True for spherical coordinates (long-lat), False for projected coordinates (default).

search_params : dict

stores search arguments

int_score : boolean

True if an adaptive bandwidth is being used and bandwidth selection should be discrete. False if a fixed bandwidth is being used and the bandwidth does not have to be discrete.

bw : scalar or array-like

Derived optimal bandwidth(s). Will be a scalar for GWR (multi=False) and a list of scalars for MGWR (multi=True) with one bandwidth for each covariate.

S : array

n*n, hat matrix derived from the iterative backfitting algorithm for MGWR during bandwidth selection

R : array

n*n*k, partial hat matrices derived from the iterative backfitting algorithm for MGWR during bandwidth selection. There is one n*n matrix for each of the k covariates.

params : array

n*k, calibrated parameter estimates for MGWR based on the iterative backfitting algorithm - computed and saved here to avoid having to do it again in the MGWR object.
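The dmat, sorted_dmat, and spherical attributes can be pictured with a small pure-Python sketch. The helper names and the Earth radius constant are assumptions for illustration and are not taken from mgwr; the sketch only shows how a pairwise distance matrix might be built for projected versus long-lat coordinates, and why a row-sorted copy is convenient for adaptive (nearest-neighbor) bandwidths.

```python
import math


def euclidean(p, q):
    """Straight-line distance for projected (x, y) coordinates."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance for (long, lat) coordinates in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * radius_km * math.asin(math.sqrt(a))


def distance_matrix(coords, spherical=False):
    """n*n matrix of pairwise distances between calibration locations."""
    dist = haversine_km if spherical else euclidean
    return [[dist(p, q) for q in coords] for p in coords]


coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dmat = distance_matrix(coords)
# Row-wise sorted copy: entry [i][k - 1] is the distance from location i
# to its k-th nearest point, which an adaptive bandwidth of k would use.
sorted_dmat = [sorted(row) for row in dmat]
print(dmat)
print(sorted_dmat)
```

Each row of the sorted matrix starts with 0.0 because every location is at distance zero from itself.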

Methods

search([search_method, criterion, bw_min, …])    Select a single bandwidth for a GWR model or a vector of bandwidths for an MGWR model.
__init__(coords, y, X_loc, X_glob=None, family=<spglm.family.Gaussian object>, offset=None, kernel='bisquare', fixed=False, multi=False, constant=True, spherical=False)[source]

Initialize self. See help(type(self)) for accurate signature.