
Fault recognition based on principal component analysis and k-nearest neighbor algorithm

ZOU Guangui, REN Ke, JI Yin, DING Jianyu, ZHANG Shaomin

Citation: ZOU Guangui, REN Ke, JI Yin, DING Jianyu, ZHANG Shaomin. Fault recognition based on principal component analysis and k-nearest neighbor algorithm[J]. Coal Geology & Exploration, 2021, 49(4): 15-23. doi: 10.3969/j.issn.1001-1986.2021.04.003

doi: 10.3969/j.issn.1001-1986.2021.04.003
Funding:

National Key Research and Development Program of China, project 2018YFC0807803

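Details
    First author:

    ZOU Guangui, male, born in 1981 in Longyan, Fujian; Ph.D., associate professor and doctoral supervisor; engaged in research on seismic interpretation and rock physics. E-mail: cumtzgg@foxmail.com

    Corresponding author:

    REN Ke, male, born in 1993 in Weifang, Shandong; Ph.D. candidate; engaged in research on seismic interpretation. E-mail: renke666@foxmail.com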

  • CLC number: P315.9

Fault recognition based on principal component analysis and k-nearest neighbor algorithm

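  • Abstract: Faults are a hazard-inducing geological factor for coal mine safety, and characterizing them is one of the main objectives of 3D seismic exploration in coal mines. The reliability of the human-computer interactive interpretation used in conventional fault interpretation depends to some extent on the interpreter's experience. To improve the accuracy of fault interpretation, a method based on principal component analysis (PCA) and the k-nearest neighbor (kNN) algorithm is proposed for detecting the distribution of faults along a target horizon. First, Yangdong Coal Mine in the Fengfeng mining area was selected as the study area, and 10 seismic attributes were extracted from the 3D seismic data obtained after high-precision processing. PCA was then used to integrate these 10 seismic attributes into 6 composite attributes. The attribute information was combined with fault information at 139 points determined from 15 wells and 3 roadways in the mine to build the known data. From these data, two data sets were assembled, data set 1 and data set 2, with training-to-test ratios of 9:1 and 3:7, respectively. Using these data sets and 10-fold cross-validation, the fault recognition accuracy of kNN was tested: 87.75% for data set 1 and 71.63% for data set 2. This indicates that the larger the training set, the higher the recognition accuracy, which in turn suggests that high-density 3D seismic data offer some advantage for this method. When the classification performance of the kNN model was tested with PCA-reduced data as input, the classification accuracies were 89.23% and 73.79%, respectively. This is because PCA reduces the dimensionality of the original input features, which lowers the computation required and improves the representational power of the features. Overall, the results show that combining PCA and kNN can effectively identify the fault distribution, reduce the influence of subjective human factors, and improve the efficiency of fault interpretation.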
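    The kNN classification and 10-fold cross-validation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic stand-ins for the six composite attributes, the function names `knn_predict` and `cross_val_accuracy` are hypothetical, Euclidean distance and majority voting are assumed, and the candidate values of k are arbitrary.

    ```python
    import math
    import random
    from collections import Counter


    def knn_predict(train_x, train_y, query, k):
        """Label `query` by majority vote among its k nearest training points."""
        # Rank training samples by Euclidean distance to the query (assumed metric).
        order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
        votes = Counter(train_y[i] for i in order[:k])
        return votes.most_common(1)[0][0]


    def cross_val_accuracy(x, y, k, n_folds=10, seed=0):
        """Mean kNN accuracy over n_folds cross-validation folds."""
        idx = list(range(len(x)))
        random.Random(seed).shuffle(idx)
        # Every n_folds-th shuffled index goes to the same fold.
        folds = [idx[f::n_folds] for f in range(n_folds)]
        scores = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in idx if i not in held]
            tx = [x[i] for i in train]
            ty = [y[i] for i in train]
            hits = sum(knn_predict(tx, ty, x[i], k) == y[i] for i in held_out)
            scores.append(hits / len(held_out))
        return sum(scores) / n_folds


    # Synthetic stand-in data (NOT the Yangdong data): two separated clusters
    # in six dimensions, labeled 0 (no fault) and 1 (fault).
    rng = random.Random(42)
    no_fault = [[rng.gauss(0.0, 0.5) for _ in range(6)] for _ in range(100)]
    fault = [[rng.gauss(3.0, 0.5) for _ in range(6)] for _ in range(100)]
    samples = no_fault + fault
    labels = [0] * 100 + [1] * 100

    # Compare a few odd values of k and keep the one with the best mean accuracy.
    accuracy_by_k = {k: cross_val_accuracy(samples, labels, k) for k in (1, 3, 5, 7)}
    best_k = max(accuracy_by_k, key=accuracy_by_k.get)
    ```

    Odd values of k avoid tied votes in a two-class problem. On real attribute data the accuracies would of course differ from those on this well-separated synthetic example.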

     

  • Fig. 1  PCA algorithm workflow

    Fig. 2  Principle of the k-nearest neighbor algorithm

    Fig. 3  Distribution of existing wells and roadways in the No.2 coal seam, Yangdong Coal Mine

    Fig. 4  Number of samples in each group

    Fig. 5  Variation of the classification accuracy of each validation set with the value of k

    Fig. 6  Variation of the average accuracy of the validation sets with the value of k

    Fig. 7  Comparison of the kNN-predicted fault distribution with the manual interpretation

    Table 1  Position coordinates and corresponding structural information of the 139 points

    | No. | X coordinate | Y coordinate | Fault present | Source |
    |---|---|---|---|---|
    | 1 | 525992 | 4039808 | | Roadway 1 |
    | 2 | 525992 | 4039808 | | Roadway 1 |
    | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ |
    | 44 | 525976 | 4039789 | | Roadway 2 |
    | 45 | 525944 | 4039817 | | Roadway 2 |
    | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ |
    | 82 | 525958 | 4039766 | | Roadway 3 |
    | 83 | 525927 | 4039795 | | Roadway 3 |
    | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ |
    | 125 | 524887 | 4039379 | | Well 155 |
    | 126 | 525257 | 4039147 | | Well 156 |
    | 127 | 525592 | 4039078 | | Well 157 |
    | 128 | 524834 | 4039530 | | Well 1402 |
    | 129 | 525672 | 4039327 | | Well 1404 |
    | 130 | 525337 | 4039388 | | Well 148 |
    | 131 | 524960 | 4039741 | | Well 141 |
    | 132 | 525224 | 4039824 | | Well 144 |
    | 133 | 525772 | 4039573 | | Well 145 |
    | 134 | 525386 | 4039710 | | Well 149 |
    | 135 | 525475 | 4040141 | | Well 1301 |
    | 136 | 525665 | 4040023 | | Well 136 |
    | 137 | 525054 | 4039979 | | Well 1303 |
    | 138 | 525225 | 4040225 | | Well 130 |
    | 139 | 525285 | 4040421 | | Well 1204 |

    Table 2  Known data set for the No.2 coal seam in Yangdong Coal Mine

    | Variance | Attenuation coefficient | Strike curvature | Reflection strength | Instantaneous phase | Maximum amplitude | Instantaneous frequency | Dip deviation | Dip continuity | Chaos | Label |
    |---|---|---|---|---|---|---|---|---|---|---|
    | 0.03 | 0.05 | 0.02 | 535.46 | 110.37 | 220983.1 | 43.39 | 0.02 | 0.08 | 0.05 | 1 |
    | 0.01 | 0.02 | 0.01 | 1211.85 | 106.04 | 72728.63 | 46.09 | 0 | 0.01 | 0.01 | 0 |
    | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ |
    | 0.02 | 0.05 | 0 | 410.21 | 116.25 | 108717.8 | 37.72 | 0 | 0.08 | 0.01 | 0 |

    Table 3  Eigenvalues of the principal components and their variance contribution rates

    | Principal component | Eigenvalue | Variance contribution rate/% | Cumulative variance contribution rate/% |
    |---|---|---|---|
    | 1 | 2.559 | 25.585 | 25.585 |
    | 2 | 1.486 | 14.862 | 40.447 |
    | 3 | 1.141 | 11.407 | 51.854 |
    | 4 | 0.999 | 9.993 | 61.847 |
    | 5 | 0.950 | 9.495 | 71.343 |
    | 6 | 0.916 | 9.161 | 80.503 |
    | 7 | 0.736 | 7.356 | 87.859 |
    | 8 | 0.552 | 5.517 | 93.376 |
    | 9 | 0.493 | 4.927 | 98.303 |
    | 10 | 0.170 | 1.697 | 100.000 |
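
    The relationship between the eigenvalues and the contribution rates in Table 3 can be checked with a short sketch. Each rate is the eigenvalue divided by the sum of all eigenvalues. The 80% cumulative cutoff used below is an assumption made for illustration; the page does not state the criterion by which six components were retained. Because the tabulated eigenvalues are rounded, the recomputed rates may differ from Table 3 in the last decimal places.

    ```python
    # Eigenvalues of the ten principal components, copied from Table 3.
    eigenvalues = [2.559, 1.486, 1.141, 0.999, 0.950, 0.916, 0.736, 0.552, 0.493, 0.170]

    # Variance contribution rate of each component, in percent.
    total = sum(eigenvalues)
    contribution = [100.0 * ev / total for ev in eigenvalues]

    # Running sum gives the cumulative variance contribution rate.
    cumulative = []
    running = 0.0
    for share in contribution:
        running += share
        cumulative.append(running)

    THRESHOLD = 80.0  # assumed cutoff in percent, not stated in the source
    # Smallest number of leading components whose cumulative rate reaches the cutoff.
    n_components = next(i + 1 for i, c in enumerate(cumulative) if c >= THRESHOLD)
    ```

    Under this assumed cutoff, `n_components` comes out as 6, which matches the six composite attributes reported in the abstract.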

    Table 4  Composite attribute data set

    | Sample point | Y1 | Y2 | Y3 | Y4 | Y5 | Y6 |
    |---|---|---|---|---|---|---|
    | 1 | 0.62325 | -0.00302 | -0.79882 | 1.62409 | 1.25643 | -1.28987 |
    | 2 | 0.60485 | 0.26497 | -0.83658 | 0.08420 | -0.40808 | -1.45664 |
    | 3 | 0.47582 | 0.19316 | -0.97945 | 0.86678 | -0.33668 | -1.39693 |
    | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ |
    | 47442 | -0.16909 | 1.71136 | -0.00700 | -3.54822 | 0.64357 | -0.64387 |
Publication history
  • Received:  2020-10-14
  • Revised:  2020-11-11
  • Release date:  2021-08-25
  • Online publication date:  2021-09-10
