To determine the main distribution areas of data points after mapping, this paper proposes a method based on data density that determines the distribution area automatically. The method gives a more intuitive view of the data distribution and can serve as a preprocessing step for data cleaning. It is applied to, and verified against, the total alkali vs. silica (TAS) diagram using the GEOROC database. SiO2, Na2O, K2O and LOI values were extracted for the rock samples in GEOROC that are relevant to the TAS diagram, and routine data cleaning and reduction yielded about 133,000 valid records covering 24 rock types. The agreement between these 24 rock types and the TAS diagram was checked by mapping the data points, computing partition statistics, and extracting the area that contains 80% of the data for each type. The analysis shows that the data distribution of 9 rock types is basically consistent with their defined fields in the TAS diagram, while that of the other 15 rock types deviates systematically from their defined fields. This large-scale analysis exposes the limitations of the TAS diagram: with only total alkali and SiO2 as indicators, it is difficult to improve the overall classification accuracy.
Ge Can, Gu Haiou, Wang Fangyue, Li Xiuyu, Zhou Yuzhang, Yuan Feng. Determination of distribution region based on data density: A case study of TAS diagram[J]. Chinese Journal of Geology, 2018, 53(4): 1240-1253.
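
The abstract does not spell out how the 80% data distribution area is extracted. One plausible reading, given here only as a minimal sketch and not as the authors' algorithm, is to bin the (SiO2, Na2O + K2O) points on a regular grid and keep the densest cells until they cover the requested share of the data. The function name `density_region`, the cell sizes, and the sample values are all hypothetical.

```python
from collections import Counter


def density_region(points, cell_w, cell_h, fraction=0.8):
    """Return the grid cells that together hold `fraction` of the points.

    Assumption: density is approximated by counts on a regular grid, and
    cells are taken from densest to sparsest. `points` is an iterable of
    (SiO2, total alkali) pairs in wt%.
    """
    points = list(points)
    if not points:
        return set()

    # Count how many points fall into each grid cell.
    counts = Counter(
        (int(x // cell_w), int(y // cell_h)) for x, y in points
    )

    target = fraction * len(points)
    region, covered = set(), 0
    # Add cells in order of decreasing count until the target is reached.
    for cell, n in counts.most_common():
        if covered >= target:
            break
        region.add(cell)
        covered += n
    return region


# Hypothetical data: nine clustered samples and one outlier.
samples = [(50.1, 3.0)] * 9 + [(70.0, 8.0)]
print(density_region(samples, 1.0, 1.0))  # {(50, 3)}
```

With a 1 wt% grid, the outlier cell is dropped because the dense cell alone already covers more than 80% of the samples. Comparing the returned cells with a rock type's defined TAS field is one way to see the kind of systematic deviation the abstract reports.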