Open Access Journal
Clustering, an unsupervised machine learning technique, categorizes objects into groups based on shared characteristics. When applied to spatial data, the assumption of independence is often violated due to similarities among adjacent regions—a phenomenon known as spatial autocorrelation. To address this, spatial clustering incorporates both non-spatial attributes (e.g., socio-economic indicators) and spatial attributes (e.g., geographic location), with spatial attributes weighted based on their influence in defining clusters. In regional economic development, creating clusters that are both spatially coherent and socio-economically homogeneous is critical for effective policy design. Strong interactions among neighboring regions can promote more integrated and balanced growth. This study proposes a spatial clustering framework that optimizes spatial attribute weighting according to the degree of spatial autocorrelation. A simulation study using 2023 data from East Java’s 38 regencies/municipalities determines optimal weights under varying spatial dependence levels. The results show that optimal spatial weights increase with the number of clusters and vary according to the strength of spatial autocorrelation. Applied to East Java, the method produced clusters with higher socio-economic homogeneity than official zones, though with reduced spatial contiguity. These findings highlight the importance of adaptive, autocorrelation-aware clustering to improve regional planning and support more evidence-based development strategies.
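The two ingredients described above, measuring spatial autocorrelation and weighting spatial against non-spatial attributes, can be illustrated with a minimal sketch. It is not the study's implementation. It assumes global Moran's I as the autocorrelation measure and a convex combination of two normalized distance matrices as the weighting scheme. The names `morans_i`, `mixed_dissimilarity`, and `alpha` are hypothetical, and the four-region data are invented purely for illustration.

```python
import numpy as np


def morans_i(x, w):
    """Global Moran's I for values x and a spatial weight matrix w (zero diagonal)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                      # deviations from the mean
    n = x.size
    return (n / w.sum()) * (z @ w @ z) / (z @ z)


def mixed_dissimilarity(features, coords, alpha):
    """Blend attribute and geographic distances, each scaled to [0, 1].

    alpha is the (assumed) weight on the spatial component:
    alpha = 0 ignores geography, alpha = 1 ignores the attributes.
    """
    features = np.asarray(features, dtype=float)
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances via broadcasting
    d_attr = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    d_geo = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return (1.0 - alpha) * d_attr / d_attr.max() + alpha * d_geo / d_geo.max()


# Toy example (invented data): four regions arranged on a line,
# where each region neighbors the one next to it.
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])

smooth = np.array([1.0, 2.0, 3.0, 4.0])       # neighbors have similar values
alternating = np.array([1.0, 4.0, 1.0, 4.0])  # neighbors have opposite values

print("Moran's I (smooth):", morans_i(smooth, w))
print("Moran's I (alternating):", morans_i(alternating, w))

# Two hypothetical socio-economic indicators per region
features = np.array([[1.0, 10.0], [4.0, 12.0], [1.5, 30.0], [4.2, 11.0]])
d = mixed_dissimilarity(features, coords, alpha=0.3)
print(np.round(d, 3))
```

In this sketch the smooth series yields a positive Moran's I and the alternating series a negative one. The blended matrix `d` could then be passed to any distance-based clustering routine, with `alpha` chosen according to the measured autocorrelation, which is the kind of adaptive weighting the abstract argues for.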