Big data GIS and its application

Author: xuzhiping   2023-02-17 15:14:02


Big data GIS is the transformation of GIS from its traditional form into the big data era. It provides more advanced theoretical methods and software platforms for the storage, analysis and visualization of spatial big data, promotes the industrial upgrading of traditional GIS, opens new channels and gives new impetus to the geographic information industry, and serves the development and deployment of the big data industry during China's "13th Five-Year Plan" period. This article analyzes the emergence of big data GIS and its application in relevant industries.

Generation of big data GIS

Big data

In recent years, the term "big data" has been mentioned more and more often. People use it to describe the massive data generated in the era of information explosion, and to name the related technological developments and innovations.

It is generally agreed that big data is characterized by large volume, rapid change, multiple types and low value density. The fundamental difference between big data and merely massive data is that big data is collected automatically, thanks to the development of the Internet, the mobile Internet, the Internet of Things and other technologies. Examples include mobile phone signaling data, navigation and positioning data, e-commerce transaction data, search engine data, social media data and bus card swipe data. By analyzing and mining these data we can extract valuable information and patterns, which helps decision-making in various industries and even supports prediction.

Spatial big data

It is often said in the industry that 80% of the data in daily life is related to spatial location. In the big data field the proportion is even higher, because the data comes mainly from automatic collection through the Internet, the mobile Internet and the Internet of Things. For example, mobile phone signaling data is generated by the signaling link between a base station and a phone, and the phone's position can be calculated from its position relative to the base stations. In social media data, the text, pictures and videos shared by users are usually tagged with location information obtained from the user's device. Bus card swipe data can obtain location information from the vehicle positioning system. Even e-commerce transaction data can be given an approximate location from the IP address.

In general, spatial big data is the part of big data that carries (or implies) a spatial location. Because of the way it is acquired, spatial big data differs from classic massive spatial data: it shares the low value density of big data. Before big data technology matured, conventional means could not process it, so its value could not be effectively analyzed or mined.

With the development of big data technology, it has become possible to exploit the value of spatial big data. Mining it lets us explore the laws and trends in big data from a new perspective, namely spatial relationships and spatiotemporal change, which opens another window for big data applications.

Big data GIS

Many practical technologies have emerged in the big data field, such as distributed file systems, distributed databases, distributed computing frameworks and stream processing frameworks. These technologies let us process and mine big data on ordinary machines, but most of them target general non-spatial data and lack specialized spatial analysis capabilities. Traditional GIS, in turn, is limited by its technical architecture and cannot adequately meet the requirements of big data for distributed storage, distributed computing and stream data processing.

Big data GIS deeply integrates big data technology with GIS technology: it embeds the core capabilities of GIS into the basic frameworks of big data to create a complete big data GIS technology system. Its core technologies are as follows:

1. Distributed technology

(1) Distributed storage of spatial data. Technologies such as distributed spatial indexing and the sharding and management of spatial data are embedded in existing distributed storage systems. By scaling out horizontally, a single table can store and manage more than one billion, or even several billion, spatial records. Common distributed storage systems include HDFS, HBase and Elasticsearch.
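As a rough illustration of how a spatial index can drive sharding, the sketch below maps each point to a grid cell and hashes the cell to a storage shard, so that nearby records tend to be stored together. The cell size, the shard count and the function names are hypothetical choices made for this example; they are not taken from HDFS, HBase, Elasticsearch or any GIS product.

```python
import zlib

CELL_DEG = 0.5   # hypothetical grid cell size in degrees
NUM_SHARDS = 8   # hypothetical number of storage shards


def cell_id(lon: float, lat: float, cell_deg: float = CELL_DEG) -> tuple[int, int]:
    """Map a coordinate to the (column, row) of the grid cell containing it."""
    col = int((lon + 180.0) // cell_deg)
    row = int((lat + 90.0) // cell_deg)
    return col, row


def shard_for(lon: float, lat: float, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from the cell id, so points in one cell are stored together."""
    col, row = cell_id(lon, lat)
    key = f"{col}:{row}".encode("ascii")
    return zlib.crc32(key) % num_shards   # stable across processes, unlike hash()


points = [(104.06, 30.67), (104.07, 30.66), (121.47, 31.23)]
for lon, lat in points:
    print((lon, lat), cell_id(lon, lat), shard_for(lon, lat))
```

The first two points fall into the same cell and therefore the same shard, so a query over a small area only needs to touch the shards that own the cells it overlaps.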

(2) Distributed spatial computing. Existing geospatial analysis algorithms are re-implemented in distributed form on the Spark distributed computing framework. Spatial analysis between hundreds of millions of spatial objects, which traditional GIS cannot complete, can then be finished within a few hours.
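One common way to make a spatial analysis distributable is to partition both inputs by grid cell, so that each cell becomes an independent task. The sketch below shows the idea for a point-in-rectangle join in plain Python. It calls no Spark API; the cell size, data layout and function names are assumptions made for illustration only.

```python
from collections import defaultdict

CELL = 1.0  # hypothetical partition cell size, in coordinate units


def cells_for_box(box):
    """Yield every grid cell that a box (xmin, ymin, xmax, ymax) overlaps."""
    xmin, ymin, xmax, ymax = box
    for cx in range(int(xmin // CELL), int(xmax // CELL) + 1):
        for cy in range(int(ymin // CELL), int(ymax // CELL) + 1):
            yield cx, cy


def partitioned_join(points, boxes):
    """Return sorted (point_id, box_id) pairs where the point lies inside the box."""
    # "Map" step: key every record by the grid cell(s) it touches.
    pts_by_cell = defaultdict(list)
    for pid, (x, y) in points.items():
        pts_by_cell[(int(x // CELL), int(y // CELL))].append(pid)
    boxes_by_cell = defaultdict(list)
    for bid, box in boxes.items():
        for cell in cells_for_box(box):
            boxes_by_cell[cell].append(bid)

    # "Reduce" step: each cell is an independent task that a cluster could
    # run on a different worker; here the cells simply run one after another.
    result = set()
    for cell, pids in pts_by_cell.items():
        for bid in boxes_by_cell.get(cell, []):
            xmin, ymin, xmax, ymax = boxes[bid]
            for pid in pids:
                x, y = points[pid]
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    result.add((pid, bid))
    return sorted(result)


points = {"p1": (0.2, 0.3), "p2": (1.5, 1.5), "p3": (5.0, 5.0)}
boxes = {"a": (0.0, 0.0, 1.0, 1.0), "b": (1.2, 1.2, 2.8, 2.8)}
print(partitioned_join(points, boxes))
```

Because a point is only compared with the boxes registered in its own cell, the total work no longer grows with the product of the two input sizes, and the cells can be processed in parallel.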

(3) Distributed map rendering. Vector pyramids, distributed rendering, automatic caching and progressive loading on the front end together achieve "slice-free" rendering of large-scale spatial data (for details, see the article "Hypermap High-performance Distributed Map Rendering Technology Decryption and Application").

2. Real-time processing technology of stream data

Building on the Spark Streaming stream computing framework, big data GIS adds real-time access, filtering, conversion, calculation, visualization and output of stream data.
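The filter, conversion and calculation stages can be pictured as a chain of small steps applied to each incoming record. The sketch below imitates such a chain with plain Python generators over a short in-memory list. It is a simplification that does not use Spark Streaming, and the record layout, the 60-second window and the function names are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator

# Hypothetical record layout: (vehicle_id, unix_seconds, lon, lat)
Record = tuple[str, int, float, float]


def valid(records: Iterable[Record]) -> Iterator[Record]:
    """Filter step: drop records whose coordinates are out of range."""
    for rec in records:
        _, _, lon, lat = rec
        if -180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0:
            yield rec


def to_window(records: Iterable[Record], seconds: int = 60) -> Iterator[tuple[int, str]]:
    """Conversion step: replace each timestamp by the start of its time window."""
    for vehicle_id, ts, _, _ in records:
        yield ts - ts % seconds, vehicle_id


def count_per_window(keyed: Iterable[tuple[int, str]]) -> dict[int, int]:
    """Calculation step: count the distinct vehicles seen in each window."""
    seen = set(keyed)
    return dict(Counter(window for window, _ in seen))


stream = [
    ("car-1", 0, 104.06, 30.67),
    ("car-2", 30, 104.07, 30.66),
    ("car-1", 45, 104.06, 30.68),
    ("car-3", 70, 999.0, 30.00),   # invalid longitude, filtered out
    ("car-2", 90, 104.08, 30.65),
]
print(count_per_window(to_window(valid(stream))))
```

A real streaming engine would emit a result for each window as it closes instead of waiting for the input to end, but the stages keep the same shape.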

3. Spatial big data visualization technology

Traditional GIS draws every feature directly on the map. Big data, however, often contains tens or hundreds of millions of records, and displaying all of them directly is neither necessary nor feasible. Visualization of spatial big data therefore focuses on expressing spatial distribution, degree of aggregation and connection relationships after the data has been analyzed and calculated.
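A simple form of this "analyze first, then draw" approach is grid aggregation: the points are counted per grid cell, and only one symbol per cell is drawn. The sketch below is a minimal version in plain Python; the cell size and function names are assumptions for illustration.

```python
from collections import Counter


def grid_counts(points, cell_deg=0.5):
    """Aggregate (lon, lat) points into counts per square grid cell."""
    counts = Counter()
    for lon, lat in points:
        counts[(int(lon // cell_deg), int(lat // cell_deg))] += 1
    return counts


def cell_center(cell, cell_deg=0.5):
    """Return the (lon, lat) center of a cell, where its symbol would be drawn."""
    col, row = cell
    return ((col + 0.5) * cell_deg, (row + 0.5) * cell_deg)


points = [(104.06, 30.67), (104.07, 30.66), (104.30, 30.80), (121.47, 31.23)]
counts = grid_counts(points)
for cell, n in counts.most_common():
    print(cell_center(cell), n)
```

However many input points there are, the map layer only has to render one record per non-empty cell, with the count driving the color or size of the symbol.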

In general, big data GIS mainly solves two problems:

New data: big data GIS extends the boundary of the spatial data that GIS manages. In addition to classic basic spatial data such as vector and raster data, it can also manage real-time stream data and archived spatial big data, which makes it an effective tool for mining and applying spatial big data.

New technology: big data GIS also extends the technical boundary of traditional GIS. By integrating big data technology, it greatly improves the storage capacity, computing performance and rendering ability of GIS for large-scale spatial data.

However, mastering big data GIS technology alone is not enough. To truly serve society, what matters more is how big data GIS can give the businesses of various industries new ways of thinking and new bases for decision-making, helping each industry absorb the impact of new technologies and providing a solid technical foundation for its development.

Application of big data GIS

The application of big data GIS in industry can be described as a "two-wheel drive": data-driven and business-driven.

The so-called "data-driven" approach means that big data applications start from the effective data sources available. Many data collectors can provide value-added data services to other industries in addition to supporting their own business. The most typical example is the mobile phone signaling data held by communication operators. Besides evaluating whether base stations and service outlets are reasonably placed, these data can be used to analyze the distribution and movement of the population, which is of great value to planning, population management, public security and other industries.

"Business-driven" refers to business needs that many industries already had, and had to meet, before big data was available. Limited by data, that work suffered from low efficiency, coarse granularity and long feedback cycles; big data can effectively solve these problems. Take commercial site selection: in the past we could only conduct field surveys or issue questionnaires. With spatial big data GIS technology, we can quickly obtain the distribution of the floating population and overlay it with the locations of existing hotels. It is then easy to see where hotels are oversupplied and where they fall short of demand, which guides the next step of hotel site selection. The same approach applies to urban planning, public safety, traffic congestion and other work.

The "two-wheel drive" of data and business will promote the application of big data GIS in the industry. However, the specific problems and solutions within each industry will vary. Let's take the natural resources field, urban planning, public security industry, and urban comprehensive management field as examples to briefly explain.

Natural resources

In April 2018, the former Ministry of Land and Resources, the State Oceanic Administration, the National Bureau of Surveying and Mapping and Geographic Information and other relevant departments were merged to form the Ministry of Natural Resources, whose responsibilities cover land, ocean, surveying and mapping, real estate registration and other areas.

In the field of natural resources, the accumulated stock of data and the growing inflow of new data have pushed data volumes from the GB and TB scale to the PB scale, which is difficult to manage effectively with traditional GIS. For example, real estate registration is carried out in individual districts and counties but must be integrated at the ministerial level into a national real estate database, in which a single table holds more than 500 million spatial records. As another example, because of accumulated historical data, one provincial geographical census database holds as much as 410 TB of data and is still growing. Traditional relational database storage based on a single node is not up to this task.

At the same time, the time spent on traditional spatial analysis operations increases as the data volume grows. For some more complex spatial operations the time grows faster than the data volume: when the data volume doubles, the processing time can increase several times over. Taking the spatial join as an example, joining 100,000 objects takes about 0.7 minutes, joining millions of objects takes about 5.6 minutes, and joining tens of millions of objects jumps to 97 minutes. For a spatial join over billions of records, traditional GIS cannot produce a result at all. The data has to be split manually by region, computed piece by piece and finally merged, which is time-consuming and laborious, and the accuracy of the result cannot be guaranteed.

When publishing and browsing spatial data, the usual way to make map browsing efficient is to pre-slice the data into tiles. Slicing national-level data down to level 18 often takes days or even weeks, which cannot meet the requirement of bringing data online quickly; without slicing, however, map browsing cannot reach real-time performance.
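To see why pre-slicing becomes so expensive at deep levels, it helps to count tiles. The sketch below assumes the widely used XYZ tiling scheme on Web Mercator, in which each level doubles the number of tile columns and rows; the article does not say which scheme is used, and the bounding box is a hypothetical 10-by-10 degree region rather than a real boundary.

```python
import math


def tile_xy(lon, lat, zoom):
    """Column and row of the tile containing a point (XYZ scheme, Web Mercator)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_bbox(west, south, east, north, zoom):
    """Number of tiles at one zoom level that cover a bounding box."""
    x0, y0 = tile_xy(west, north, zoom)   # top-left tile
    x1, y1 = tile_xy(east, south, zoom)   # bottom-right tile
    return (x1 - x0 + 1) * (y1 - y0 + 1)


# Hypothetical 10-by-10 degree region, not a real administrative boundary.
bbox = (100.0, 20.0, 110.0, 30.0)
for zoom in (10, 14, 18):
    print(zoom, tiles_in_bbox(*bbox, zoom))
print("levels 0-18 combined:", sum(tiles_in_bbox(*bbox, z) for z in range(19)))
```

Each additional level roughly quadruples the number of tiles for the same area, so the deepest levels dominate the slicing time. This is the cost that "slice-free" rendering avoids.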

Applying big data GIS in the field of natural resources addresses these pain points. Distributed storage technology can manage billions of spatial objects in a single table and scales out horizontally almost without limit. Distributed spatial analysis greatly reduces the time spent on spatial computation, so that a full overlay analysis between hundreds of millions of objects can be completed within one hour. With high-performance distributed map rendering, data imported into a distributed spatial database can be published and browsed "slice-free". For example, the spatiotemporal big data analysis system of the Sichuan Provincial Bureau of Surveying and Mapping, built on spatiotemporal big data basic support software with a distributed architecture, achieves rapid visualization of vegetation cover layers with tens of millions of records.


Figure 1: Big data core GIS technology.

Urban planning

Urban planning is a typical business-driven application of big data GIS. Before big data GIS, the data that urban planning relied on were often out of date and coarse in granularity, so decisions frequently had to be made on intuition. With big data GIS, we can observe how the city actually operates in real time, including population distribution and the jobs-housing relationship (where people work versus where they live), which gives planners a comprehensive spatial view and a quantitative basis for drawing up plans. Planners can interpret the jobs-housing relationship from the perspectives of population, employment, jobs, land use, public services, transportation, commuting and leisure, rather than being limited to a single "jobs-housing balance" index. As shown in Figure 1, the sketch display system for Shanghai's urban spatial units can analyze commuting patterns in the city and help formulate improvement measures by relating employment plots to residential plots.

In addition, with the spatial visualization technology of big data, various planning results can be brought together on one map, where each can be extracted and viewed and the relationships between multiple plans can be clearly understood. Besides basic data, the system can also provide thematic business data to assist planning. For example, by displaying bus card swipe routes and the swipe counts at each stop, and combining them with other information such as population distribution, we can analyze whether bus routes are reasonably planned and where stops need to be added, providing decision support for urban planning, as shown in Figure 2.


Figure 2: Instantaneous population density reflects occupational and residential characteristics.

Public security industry

Public security data includes basic geographic data, 3D model data and rich public security thematic data, such as police cars, police officers, cameras, public security agencies, key areas and control points (most of which are real-time data). Public security work often requires real-time monitoring of moving targets by location, and positions must be calculated in real time as the data is received. Archiving, computing and visualizing such massive dynamic data requires big data GIS. In the public security industry, big data GIS relies mainly on cloud GIS, distributed storage and stream data processing technologies to integrate the basic geographic information database with the public security thematic database, using spatiotemporal information to provide more efficient geographic information services for the work of the various police branches.

Stream data management also supports the storage and retrieval of historical data, track playback and other functions. It shows whether patrol vehicles follow their required inspection routes and whether problems occur along the way, and it helps check whether the patrol routes are reasonably designed, providing a reference for the scientific allocation of police resources.

In addition, the spatial analysis technology of big data GIS opens a new perspective on existing public security work. For example, identifying vehicles with cloned license plates relies mainly on comparing the captured license plate with the vehicle model. If the cloned vehicle is the same model, it is difficult to identify accurately. The "element connection" (feature join) algorithm of big data spatial analysis can be used with suitable analysis and extraction parameters. For example, when the same license plate is captured at two places more than 10 kilometers apart within five minutes, it is flagged as a suspected cloned plate, which provides stronger clues by combining space and time.
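The space-time rule in this example can be sketched in a few lines of Python. The version below groups sightings by plate and compares every pair of sightings of the same plate. The two thresholds come from the example above, while the record layout, the function names and the spherical-Earth distance formula are assumptions for illustration, not the actual algorithm of any product.

```python
import math
from collections import defaultdict
from itertools import combinations

MAX_SECONDS = 5 * 60   # five minutes, from the example in the text
MIN_KM = 10.0          # ten kilometers, from the example in the text


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers on a spherical Earth (radius 6371 km)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def suspected_clones(sightings):
    """Return the sorted plates seen more than MIN_KM apart within MAX_SECONDS.

    sightings: iterable of (plate, unix_seconds, lon, lat), a hypothetical layout.
    """
    by_plate = defaultdict(list)
    for plate, ts, lon, lat in sightings:
        by_plate[plate].append((ts, lon, lat))   # the join key is the plate

    suspects = set()
    for plate, recs in by_plate.items():
        for (t1, x1, y1), (t2, x2, y2) in combinations(recs, 2):
            if abs(t1 - t2) <= MAX_SECONDS and haversine_km(x1, y1, x2, y2) > MIN_KM:
                suspects.add(plate)
                break
    return sorted(suspects)


sightings = [
    ("A123", 0, 104.00, 30.00), ("A123", 120, 104.20, 30.00),   # far apart, 2 min
    ("B456", 0, 104.00, 30.00), ("B456", 200, 104.01, 30.00),   # close together
    ("C789", 0, 104.00, 30.00), ("C789", 3600, 105.00, 30.00),  # far, but 1 hour
]
print(suspected_clones(sightings))
```

Comparing every pair is quadratic in the number of sightings per plate. At the scale of a real city the sightings would first be sorted or windowed by time, and the per-plate groups would be processed in parallel.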

Integrated urban management

With the development of a new generation of smart cities, the citizens, transportation, commerce, communications and natural resources of a city have gradually become interconnected. Guo Renzhong, academician of the Chinese Academy of Engineering, believes that "smart cities are based on common facilities and data resources, have many common operations, and need an operating system, and the operating system of smart cities can be none other than GIS". Big data GIS extends both the data boundary and the technology boundary of traditional GIS, bringing new opportunities for the comprehensive management of smart cities.

With the joint development of big data GIS and digital twin technology, digital models will cover every corner of the city and provide diversified data support for comprehensive urban management. Big data GIS will integrate the city's multi-source data, spatial and non-spatial, structured and unstructured, and manage it in a unified way, making comprehensive management based on an urban digital model possible.

The growing volume of urban data has broadened the scope of urban management services, and the efficient calculation and query capabilities of big data GIS are especially needed. For example, the distribution of communication base stations can be used to delimit the urban spatial boundary; navigation maps, POI (Point of Interest) data, public reviews and other data can be used to define and identify urban public space; and enterprise registration data can be used to simulate the direction in which enterprises migrate. All of these provide more diversified geographic information services for government functions and the public.

As new urban management and service requirements emerge, the visualization capabilities of traditional GIS can no longer meet application needs. Building on distributed rendering, stream data processing and the visualization technologies of big data GIS, it becomes possible to display above-ground and underground, indoor and outdoor, and dynamic and static data in an integrated way, bringing a fresh experience to government affairs, enterprise management and daily life.

Under the "two-wheel drive" of industrial applications, big data GIS has become a bridge between spatial big data and industry applications. Beyond the industries discussed in this article, many others, such as meteorology, water conservancy, environmental protection and the military, are integrating big data GIS capabilities into their current business platforms or systems to upgrade and expand their GIS-based big data platforms.

In the future, as hardware continues to improve and cloud computing, cloud-native and related technologies become widespread, big data GIS technology will keep improving as well. Storage and analysis technology for spatial big data will develop towards greater processing capacity and higher efficiency, and the data it can handle will become more complex, more variable and closer to real time. Big data GIS with built-in distributed and stream data technology will replace traditional GIS and become the standard configuration of GIS.

With the comprehensive deployment of geographical big data in the "13th Five-Year Plan" and the major strategic demand for spatial big data analysis in the construction of the "Belt and Road", big data GIS will play an irreplaceable role in every field of the economy and society, and its prospects for application and development are vast.
