Spatial Data Modeling in Disposable Income Per Capita in China using Nationwide Spatial Autoregressive (SAR)

I. Introduction
China has an advanced economy, is among the countries with the largest exports, and is predicted to overtake the United States in economic strength [1]. In 2014, the regional director for the GBTA (Global Business Travel Association) reported that China's economic growth had also stimulated the tourism business sector. In the same year, China reported very rapid economic progress, with GDP in 2014 representing a 28.3-fold rise and per capita GDP a 19-fold rise. The revitalization of the Chinese nation has moved China's large emerging economy toward the center of the world economy. However, with a population of 1.3 billion, China's per capita income still ranks 80th in the world, and 100 million people are still poor. Despite rapid industrial development, per capita income remains unbalanced between urban and rural areas. A deeper assessment is needed to address this imbalance, because per capita income is often used as a benchmark for the prosperity of a country: the greater the per capita income, the more affluent the country is judged to be. Moreover, China is counted among the developed countries [1].
Understanding the factors that influence per capita income is important, because it can serve as a reference for deciding which factors contribute most to higher per capita income. It provides a sound basis for determining policy and is expected to make policy-making easier, so that possible losses or weaknesses can be anticipated. The tourism industry is also closely tied to economic progress in China. Therefore, this research examines the influence of industry and tourism on the economy, particularly their effect on per capita income.
China has become an economic center of the world, yet with a population of 1.3 billion its per capita income still ranks 80th in the world, with an imbalance between town and country and 100 million people still living in poverty. To address this imbalance, the condition must be studied in depth, because per capita income is often used as a benchmark of a country's prosperity. With greater and more equitable per capita income, a country is judged increasingly affluent. Two factors, industry and tourism, play an important role in China's economic progress. The variables considered are Per Capita Disposable Income Nationwide (yuan), Total Value of Exports of Operating Units (1,000 USD), Registered Unemployed Persons in Urban Area (10,000 persons), Foreign Exchange Earnings from International Tourism (USD million), and Number of Overseas Visitor Arrivals (million person-times). It is therefore necessary to investigate how these factors affect per capita income. Since the economic development of a region usually affects the surrounding area, this study includes spatial effects by using the Spatial Autoregressive (SAR) model. The results suggest that the tourism factor accounts for about 58.65% of the variation in per capita income (R-squared).
The analysis uses spatial regression. It is based on the theory advanced by Tobler that everything is related to everything else, but near things are more related than distant things [2]. Spatial effects are a common issue among regions, especially regions adjacent to each other [3]. The analysis is expected to identify the significant factors that affect per capita income, so that measures can be found to improve the welfare of the community equitably.

A. Spatial Statistics
Spatial statistics is a statistical method used to analyze spatial data. Spatial data are data that contain location information, so they record not only what is measured but also where the measurement is located. Spatial data may include geographic information such as the latitude and longitude of each region, its borders, and its position relative to other regions. Put simply, spatial data can be expressed as address information. Spatial data can also be expressed as grid coordinates, as on a grid map, or as pixels, as in satellite imagery. The results of spatial statistical analysis are therefore usually presented as thematic maps [2].

B. Spatial Data Analysis
Spatial data are data that contain the location or geographical information of a region. Spatial analysis covers many operations and concepts, including simple calculations, classification, structuring, geometric overlay, and cartographic modeling [4]-[6]. In general, spatial analysis requires data that are tied to a location and describe the characteristics of that location. Spatial analysis consists of three groups: visualization, exploration, and modeling. Visualization communicates the results of spatial analysis. Exploration processes spatial data with statistical methods. Modeling expresses causal relationships, using spatial and non-spatial data sources to predict spatial patterns [7], [8]. Locations in spatial data must be measured so that any spatial effects can be detected. Location information can be identified from two sources [9]:
1. Neighborhood relations. A neighborhood relation reflects the location of one spatial unit relative to another in a given space. The neighborhood relations of spatial units are usually derived from a map. Neighboring spatial units are expected to show a higher degree of spatial dependence than units that are located far apart.
2. Distance. The latitude and longitude of a location in a given space are a source of information. This information is used to calculate the distance between points in that space. The strength of spatial dependence is expected to decrease as distance increases.
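The distance-based source of location information can be illustrated with a short sketch. It is not taken from the study: the function names `haversine_km` and `inverse_distance_weights`, the inverse-distance form of the weights, and the approximate city coordinates are all assumptions chosen for illustration, and the locations are assumed to be distinct.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dphi = phi2 - phi1
    dlmb = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlmb / 2) ** 2
    return 2 * r * np.arcsin(np.sqrt(a))

def inverse_distance_weights(lats, lons):
    """Row-standardized weights w_ij proportional to 1 / d_ij (assumed form).

    Assumes all locations are distinct, so no off-diagonal distance is zero.
    """
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    # Pairwise distance matrix via broadcasting.
    d = haversine_km(lats[:, None], lons[:, None], lats[None, :], lons[None, :])
    w = np.zeros_like(d)
    off_diag = ~np.eye(len(lats), dtype=bool)
    w[off_diag] = 1.0 / d[off_diag]  # a region is not its own neighbor
    return w / w.sum(axis=1, keepdims=True)

# Approximate, illustrative coordinates (Beijing, Shanghai, Guangzhou).
W_dist = inverse_distance_weights([39.9, 31.2, 23.1], [116.4, 121.5, 113.3])
```

Each row of `W_dist` sums to one, and nearer locations receive larger weights, which matches the expectation that spatial dependence weakens with distance.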

C. Spatial Autocorrelation
Spatial autocorrelation is an estimate of the correlation between observed values of the same variable at different spatial locations. Positive spatial autocorrelation means that adjacent locations have similar values and tend to cluster. Negative spatial autocorrelation means that adjacent locations have different values and tend to be dispersed [2]. Kosfeld describes the characteristics of spatial autocorrelation as follows:
1. If there is a systematic pattern in the spatial distribution of the observed variable, then there is spatial autocorrelation.
2. If regions that are closer or adjacent to each other have more similar values, there is positive spatial autocorrelation.
4. A random pattern in the spatial data indicates no spatial autocorrelation.
Spatial autocorrelation can be measured with Moran's Index (Moran's I), Geary's C, and Tango's excess. This study is limited to Moran's Index [2], [10]. The method can be used to detect departures from spatial randomness, which may indicate clustering or a spatial trend.

D. Spatial Weighted Matrix
A spatial weighted matrix is a matrix of size $n \times n$, denoted $W$, that expresses the neighborhood relationships among the observed regions. The general form of the spatial weight matrix $W$ is shown in (1).

$$W = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{bmatrix} \qquad (1)$$

The elements of $W$ are $w_{ij}$, where $i$ indexes the rows and $j$ the columns of $W$, and $j$ refers to the regions around observation location $i$. Each element of $W$ can take one of two values, zero or one [11]: $w_{ij} = 1$ for a region adjacent to the observation location, and $w_{ij} = 0$ for a region not adjacent to it [12]-[14]. In general there are three types of interaction, or boundary contact, between areas [15]-[17], namely:

1. Rook Contiguity
Rook contiguity is contact between a side of one region and a side of a neighboring region. If locations $i$ and $j$ share a side, then $w_{ij} = 1$. If locations $i$ and $j$ do not share a side, then $w_{ij} = 0$.

2. Bishop Contiguity
Bishop contiguity is contact between a vertex (corner) of one region and a vertex of a neighboring region. If locations $i$ and $j$ touch at a vertex, then $w_{ij} = 1$. If locations $i$ and $j$ do not touch at a vertex, then $w_{ij} = 0$.

3. Queen Contiguity
Queen contiguity is contact between either a side or a corner of one region and another region, combining rook contiguity and bishop contiguity. If locations $i$ and $j$ touch at a side or a vertex, then $w_{ij} = 1$. If locations $i$ and $j$ touch at neither a side nor a vertex, then $w_{ij} = 0$.
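The three contiguity rules can be sketched on a regular lattice, where sides and corners are easy to enumerate. This is only an illustration under that assumption: real provinces are irregular polygons, and the helper names `grid_contiguity` and `row_standardize` are hypothetical rather than part of the study.

```python
import numpy as np

def grid_contiguity(n_rows, n_cols, kind="queen"):
    """Binary contiguity matrix for the cells of an n_rows x n_cols lattice."""
    rook = [(-1, 0), (1, 0), (0, -1), (0, 1)]      # shared sides
    bishop = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # shared corners only
    offsets = {"rook": rook, "bishop": bishop, "queen": rook + bishop}[kind]
    n = n_rows * n_cols
    w = np.zeros((n, n), dtype=int)
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c  # row-major index of cell (r, c)
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i, rr * n_cols + cc] = 1
    return w

def row_standardize(w):
    """Divide each row by its sum; rows without neighbors stay all zero."""
    w = np.asarray(w, dtype=float)
    sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, sums, out=np.zeros_like(w), where=sums > 0)

rook_w = grid_contiguity(3, 3, "rook")
bishop_w = grid_contiguity(3, 3, "bishop")
queen_w = grid_contiguity(3, 3, "queen")
queen_std = row_standardize(queen_w)
```

On the 3 x 3 lattice the center cell has four rook neighbors, four bishop neighbors, and eight queen neighbors, and the queen matrix is the elementwise sum of the other two.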

E. Moran's Index
The theory of spatial autocorrelation is an important element in the investigation of geographical space from different viewpoints [18]. Moran's I is an extension of the Pearson correlation to a univariate data series. The Pearson correlation ($\rho$) between a predictor variable $x$ and a response variable $y$ with $n$ observations is given by formula (2).

$$\rho = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \qquad (2)$$

where $\bar{x}$ and $\bar{y}$ are the sample means of the predictor and response variables. The value of $\rho$ measures whether the predictor and response variables are correlated.
The Moran's I coefficient is used to test for spatial dependency, or autocorrelation, between observations or locations [19]-[21]. The test statistic uses formulas (3) to (10), with the null hypothesis H0: I = 0 (no autocorrelation between locations) and the alternative H1: I ≠ 0 (autocorrelation between locations). The value of the index $I$ lies between -1 and 1. If $I > I_0$, where $I_0$ is the expected value of $I$, the data have positive autocorrelation; if $I < I_0$, the data have negative autocorrelation; a Moran index of zero indicates no clustering. The Moran index does not guarantee an accurate measurement if the weighting matrix used is not standardized.
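The test can be sketched in a few lines of NumPy. The sketch assumes the standard definition of Moran's I, the expected value $E(I) = -1/(n-1)$, and the variance under the normality assumption; these may differ in notation from the formulas (3) to (10) that the text refers to, and the function name `morans_i` and the six-region chain are hypothetical.

```python
import numpy as np

def morans_i(x, w):
    """Return (I, E[I], z-score) for values x and spatial weight matrix w.

    Uses the textbook normality-assumption variance (an assumption here).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    z = x - x.mean()                      # deviations from the mean
    s0 = w.sum()                          # sum of all weights
    i_obs = (n / s0) * (z @ w @ z) / (z @ z)
    e_i = -1.0 / (n - 1)                  # expected value under H0
    s1 = 0.5 * ((w + w.T) ** 2).sum()
    s2 = ((w.sum(axis=0) + w.sum(axis=1)) ** 2).sum()
    var_i = (n**2 * s1 - n * s2 + 3 * s0**2) / ((n**2 - 1) * s0**2) - e_i**2
    z_score = (i_obs - e_i) / np.sqrt(var_i)
    return i_obs, e_i, z_score

# Hypothetical example: six regions in a chain, each adjacent to the next.
n_regions = 6
chain_w = np.zeros((n_regions, n_regions))
for k in range(n_regions - 1):
    chain_w[k, k + 1] = chain_w[k + 1, k] = 1.0

trend_i, trend_e, trend_z = morans_i([1, 2, 3, 4, 5, 6], chain_w)
alt_i, alt_e, alt_z = morans_i([1, -1, 1, -1, 1, -1], chain_w)
```

The smoothly increasing series gives an index above its expected value (positive autocorrelation), while the alternating series gives an index below it (negative autocorrelation). The z-score would then be compared with a standard normal critical value.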

F. Spatial Model (SAR)
The Spatial Autoregressive (SAR) model combines a simple regression model with a spatial lag of the dependent variable, using cross-section data [3], [10]. The SAR model is obtained when $W_2 = 0$ and $\lambda = 0$, so the model assumes that the autoregressive process applies only to the response variable [10]. The general SAR model is shown in (11).

$$y = \rho W_1 y + X\beta + \varepsilon \qquad (11)$$

where $y$ is the dependent variable vector of size $n \times 1$, $\rho$ is the spatial autocorrelation coefficient on the dependent variable, $W_1$ is the spatial weighted matrix of size $n \times n$, $X$ is the matrix of independent variables of size $n \times (k + 1)$, $\beta$ is the vector of regression coefficients of size $(k + 1) \times 1$, and $\varepsilon$ is the error vector, free of autocorrelation, of size $n \times 1$. The maximum likelihood estimate of $\beta$ in the SAR model is given in (12).

$$\hat{\beta} = (X^{T} X)^{-1} X^{T} (I - \rho W_1) y \qquad (12)$$

This model extends the first-order autoregressive model: the response variable is influenced both by its own spatial lag and by the predictor variables. The spatial autoregressive process is therefore analogous to a first-order autoregressive model in time series analysis.
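Maximum likelihood estimation of the SAR model can be sketched as follows. This is a minimal illustration, not the estimation routine used in the study: it assumes a grid search over $\rho$ on the concentrated log-likelihood, with $\beta$ obtained by least squares on $(I - \rho W)y$ for each candidate $\rho$. The names `fit_sar` and `ring_weights`, the ring-shaped neighborhood, and the simulated data are all hypothetical.

```python
import numpy as np

def ring_weights(n):
    """Row-standardized weights where region i neighbors i-1 and i+1 on a ring."""
    eye = np.eye(n)
    return 0.5 * (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1))

def fit_sar(y, X, W, grid=None):
    """Grid-search ML for y = rho*W*y + X*beta + eps; returns (rho_hat, beta_hat)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 199)  # candidate values of rho
    n = y.size
    eye = np.eye(n)
    best = None
    for rho in grid:
        A = eye - rho * W
        sign, logdet = np.linalg.slogdet(A)   # log|I - rho W|
        if sign <= 0:
            continue                          # outside the admissible range
        Ay = A @ y
        beta, *_ = np.linalg.lstsq(X, Ay, rcond=None)
        resid = Ay - X @ beta
        # Concentrated log-likelihood, up to an additive constant.
        loglik = logdet - 0.5 * n * np.log(resid @ resid / n)
        if best is None or loglik > best[0]:
            best = (loglik, rho, beta)
    _, rho_hat, beta_hat = best
    return rho_hat, beta_hat

# Simulated check with known parameters (purely illustrative data).
rng = np.random.default_rng(0)
n_obs = 200
W_ring = ring_weights(n_obs)
X_sim = np.column_stack([np.ones(n_obs), rng.normal(size=n_obs)])
beta_true = np.array([1.0, 2.0])
eps = 0.1 * rng.normal(size=n_obs)
y_sim = np.linalg.solve(np.eye(n_obs) - 0.5 * W_ring, X_sim @ beta_true + eps)
rho_hat, beta_hat = fit_sar(y_sim, X_sim, W_ring)
```

The first column of `X_sim` is the intercept, which is why $X$ has $k + 1$ columns for $k$ predictors. A grid search was chosen for readability; a numerical optimizer over $\rho$ would be the usual choice for larger problems.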

III. Methods
The analytical method used is spatial regression analysis, namely the Spatial Autoregressive (SAR) model. The research follows the procedure shown in Fig. 1.

A. Data Exploration using Thematic Map
Before computing Moran's Index, the variables are explored with thematic maps to see whether a spatial pattern is present. Fig. 2 shows the thematic map for each variable. The maps show a spatial correlation pattern between regions. The pattern suggests positive spatial correlation, because regions with the same color lie close to each other, which means that similar values are located near each other. This visual diagnosis is not conclusive until Moran's Index has been calculated for the variables. The next paragraph examines this hypothesis.

Fig. 1. Flowchart of Research Method

The data used are secondary data available on the website of the National Bureau of Statistics of China in 2014. The variables used are Per Capita Disposable Income Nationwide (yuan), Foreign Exchange Earnings from International Tourism (USD million), Total Value of Exports of Operating Units (1,000 USD), Registered Unemployed Persons in Urban Area (10,000 persons), Number of Overseas Visitor Arrivals (million person-times), and Number of Industrial Enterprises above Designated Size (units).