AI system for geological exploration and deposit search
A single exploration well costs $500,000–$5,000,000. Of every 1,000 potential targets, only 1–3 are ever mined. AI cuts the number of "dry" wells by steering exploration toward the areas with the highest probability of hosting ore.
Geospatial data analysis
Predictors of mineralization:
A deposit forms where several geological factors coincide. ML finds the combinations of features that predict ore bodies:
```python
import numpy as np
import pandas as pd
import rasterio
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler


class MineralProspectivityModel:
    """
    Mineral prospectivity model for locating mineralization.
    Inputs: geophysics, geochemistry, remote sensing, structural geology.
    """

    def prepare_features(self, geodatasets: dict) -> pd.DataFrame:
        """
        geodatasets: dict of {layer_name: raster_path}
        Layers: magnetic_anomaly, gravity, dem, radiometry_k, radiometry_th,
        geochemistry_cu, geochemistry_au, fault_distance, lithology_encoded
        All rasters must share the same grid (extent, resolution, CRS).
        """
        feature_arrays = {}
        for layer_name, raster_path in geodatasets.items():
            with rasterio.open(raster_path) as src:
                data = src.read(1).astype(float)
                if src.nodata is not None:
                    data[data == src.nodata] = np.nan
            if layer_name == 'magnetic_anomaly':
                # Derived feature: magnetic gradient magnitude, computed on the
                # 2D grid (in pixel units) before flattening, so that row ends
                # do not produce spurious gradients
                d_row, d_col = np.gradient(data)
                feature_arrays['mag_gradient'] = np.hypot(d_row, d_col).flatten()
            feature_arrays[layer_name] = data.flatten()

        features_df = pd.DataFrame(feature_arrays)

        # Distance to known faults (fluid pathways):
        # fault_distance is already expressed in metres

        # Pixels with missing data are dropped; the surviving index holds
        # the flat (row-major) pixel positions
        return features_df.dropna()

    def train_prospectivity(self, features_df, known_deposits_mask):
        """
        known_deposits_mask: binary array, 1 = known deposit (positive)
        Trains on a rebalanced sample: positive = known deposits,
        negative = geologically unprospective ground
        """
        X = features_df.values
        y = np.asarray(known_deposits_mask).ravel()
        if y.size != len(features_df):
            # Mask covers the full raster: keep only pixels that survived dropna()
            y = y[features_df.index.to_numpy()]

        # Scale first: SMOTE interpolates between nearest neighbours,
        # so the features need comparable ranges
        scaler = StandardScaler()
        X_scaled = scaler.fit_transform(X)

        # Class balance: positives are rare, oversample them to 30% of negatives
        sm = SMOTE(sampling_strategy=0.3, random_state=42)
        X_res, y_res = sm.fit_resample(X_scaled, y)

        model = RandomForestClassifier(
            n_estimators=500, max_depth=12,
            min_samples_leaf=5, n_jobs=-1, random_state=42
        )
        model.fit(X_res, y_res)
        return model, scaler
```
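Because `prepare_features` drops pixels with missing data, the model's scores no longer line up one-to-one with the raster grid. The sketch below shows one way to put them back, assuming the index of the surviving rows still holds the flat pixel positions. `scores_to_grid` and its arguments are hypothetical names for illustration, not part of the class above.

```python
import numpy as np

def scores_to_grid(scores, valid_index, shape):
    """Place per-pixel scores back onto the raster grid.

    scores      : 1D array of probabilities for the pixels that survived dropna()
    valid_index : flat (row-major) positions of those pixels, e.g. features_df.index
    shape       : (rows, cols) of the original raster
    """
    grid = np.full(shape[0] * shape[1], np.nan)   # dropped pixels stay NaN
    grid[np.asarray(valid_index)] = scores
    return grid.reshape(shape)

# Toy 2 x 3 raster in which flat pixels 1 and 4 had missing data
prospectivity = scores_to_grid(
    scores=np.array([0.1, 0.8, 0.3, 0.6]),
    valid_index=[0, 2, 3, 5],
    shape=(2, 3),
)
```

In practice `scores` would come from something like `model.predict_proba(scaler.transform(features_df.values))[:, 1]`, and the resulting grid can be written out with the georeferencing of any input raster.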
Input data types and their value:
| Data source | Resolution | Depth of investigation | Exploration value |
|---|---|---|---|
| Airborne magnetic survey | 50–200 m | 500–3000 m | Body outlines, faults |
| Gravimetry | 200–500 m | 5–10 km | Mafic bodies, salt |
| Sentinel-2 SWIR | 20 m | Surface | Hydroxyl minerals, clays |
| ASTER TIR | 90 m | Surface | Mineral composition |
| Soil / stream-sediment geochemistry | Sample points | 1–2 m | Direct indicators |
| CSAMT/MT | Profiles | 1–5 km | Conductive zones |
Geophysical data processing
Seismic interpretation with neural networks:
Manual interpretation of seismic sections takes weeks. A CNN automates horizon and fault detection:
```python
import torch
import torch.nn as nn


class SeismicHorizonPicker(nn.Module):
    """
    U-Net for automatic picking of seismic horizons.
    Input: 2D seismic section [H x W], passed as a tensor [B, 1, H, W]
    Output: horizon probability mask [H x W], returned as [B, 1, H, W]
    H and W must be divisible by 8 (three 2x2 pooling steps).
    """

    def __init__(self):
        super().__init__()
        # Encoder
        self.enc1 = self._double_conv(1, 64)
        self.enc2 = self._double_conv(64, 128)
        self.enc3 = self._double_conv(128, 256)
        self.pool = nn.MaxPool2d(2)
        # Bottleneck
        self.bottleneck = self._double_conv(256, 512)
        # Decoder: each stage upsamples, then fuses the matching encoder output
        self.up3 = nn.ConvTranspose2d(512, 256, 2, 2)
        self.dec3 = self._double_conv(512, 256)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, 2)
        self.dec2 = self._double_conv(256, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, 2, 2)
        self.dec1 = self._double_conv(128, 64)
        self.out = nn.Conv2d(64, 1, 1)

    def _double_conv(self, in_ch, out_ch):
        # Two 3x3 convolutions, each followed by batch norm and ReLU
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU()
        )

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        b = self.bottleneck(self.pool(e3))
        # Skip connections: concatenate along the channel dimension
        d3 = self.dec3(torch.cat([self.up3(b), e3], 1))
        d2 = self.dec2(torch.cat([self.up2(d3), e2], 1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], 1))
        # Per-pixel probability of belonging to a horizon
        return torch.sigmoid(self.out(d1))
```
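The network above halves the section three times with 2×2 pooling, so the skip connections only line up when both sides of the input are divisible by 8. Real sections rarely are. A minimal sketch of one workaround is to reflect-pad the section before inference and crop the mask afterwards. `pad_to_multiple` is a hypothetical helper, and reflect padding is an assumed choice rather than something the model requires.

```python
import numpy as np

def pad_to_multiple(section, multiple=8):
    """Reflect-pad a 2D section so both sides are divisible by `multiple`.

    Returns the padded array and the original (rows, cols) for cropping later.
    """
    h, w = section.shape
    pad_h = (-h) % multiple   # rows to add at the bottom
    pad_w = (-w) % multiple   # columns to add on the right
    padded = np.pad(section, ((0, pad_h), (0, pad_w)), mode="reflect")
    return padded, (h, w)

# Synthetic section: 250 time samples x 601 traces
section = np.random.default_rng(0).normal(size=(250, 601))
padded, (h, w) = pad_to_multiple(section)
# padded.shape is (256, 608); crop the predicted mask with mask[:h, :w]
```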
Well log analysis:
- Automatic correlation of layers between wells: DTW (Dynamic Time Warping) on GR, SP, and resistivity curves
- Lithological classification: Random Forest on the well-log suite → 10–15 lithotypes
- Porosity and oil saturation estimation: neural network calibrated from core measurements to log responses
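To make the DTW step concrete, here is a minimal pure-Python sketch of classic dynamic time warping between two gamma-ray curves. The sample values are invented, absolute difference is an assumed local cost, and a production workflow would normally normalise the curves and constrain the warping window.

```python
def dtw_path(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping.

    Returns the total alignment cost and the list of matched index pairs (i, j).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])           # local cost between two samples
            cost[i][j] = d + min(cost[i - 1][j],      # sample of a repeated
                                 cost[i][j - 1],      # sample of b repeated
                                 cost[i - 1][j - 1])  # one-to-one match
    # Backtrack from the end to recover which depths were matched
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

# Invented GR readings (API units) over the same interval in two wells
gr_well_a = [30, 32, 90, 95, 40, 35]
gr_well_b = [31, 88, 96, 94, 38]
total_cost, path = dtw_path(gr_well_a, gr_well_b)
```

Each pair in `path` links a sample index in well A to a sample index in well B, which is what lets a marker picked in one well be carried over to the other.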
Probabilistic resource assessment
Monte Carlo resource modeling:
JORC and other CRIRSCO-aligned reporting codes require the uncertainty of an estimate to be stated. ML + MC provides a range instead of a point estimate:
```python
import numpy as np


def estimate_resources_montecarlo(
    kriging_grades, kriging_variances,
    density=2.8, n_simulations=10000, seed=None
):
    """
    Estimate contained metal with uncertainty.
    kriging_grades: grid of mean block grades, in percent
    kriging_variances: kriging variance per block
    density: bulk density in t/m³
    """
    rng = np.random.default_rng(seed)
    block_volume_m3 = 10 * 10 * 5  # 10 x 10 x 5 m blocks
    # Ore tonnage does not depend on the simulated grades: m³ x t/m³ = tonnes
    tonnage = kriging_grades.size * block_volume_m3 * density

    results = []
    for _ in range(n_simulations):
        # Simulate the grade of every block. Blocks are drawn independently,
        # which ignores spatial correlation and tends to understate the spread
        simulated_grades = rng.normal(
            loc=kriging_grades,
            scale=np.sqrt(kriging_variances)
        )
        simulated_grades = np.clip(simulated_grades, 0, None)
        # Contained metal in tonnes (grade is in percent)
        metal_tonnes = tonnage * np.mean(simulated_grades) / 100
        results.append(metal_tonnes)

    # Percentiles of the simulated metal distribution
    p10, p50, p90 = np.percentile(results, [10, 50, 90])
    return {'P10': p10, 'P50': p50, 'P90': p90,
            'uncertainty_ratio': (p90 - p10) / p50}
```
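The arithmetic behind a single simulation can be written out as follows, assuming bulk density in t/m³ and grade in percent. The block count and mean grade in the worked line are illustrative numbers only, not results from any deposit.

```latex
\begin{align*}
T &= N \cdot V_b \cdot \rho, &
M &= T \cdot \frac{\bar{g}}{100}, &
\text{uncertainty ratio} &= \frac{P_{90} - P_{10}}{P_{50}} \\[4pt]
T &= 1000 \cdot 500\,\mathrm{m^3} \cdot 2.8\,\mathrm{t/m^3} = 1\,400\,000\,\mathrm{t}, &
M &= 1\,400\,000\,\mathrm{t} \cdot \frac{1.2}{100} = 16\,800\,\mathrm{t}
\end{align*}
```

Here $N$ is the number of blocks, $V_b = 10 \times 10 \times 5 = 500\ \mathrm{m^3}$ the block volume, $\rho$ the density, $\bar{g}$ the mean simulated grade, $T$ the ore tonnage and $M$ the contained metal.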
Remote sensing in geological exploration
Hyperspectral analysis:
AVIRIS, HyMap, PRISMA: 200+ spectral channels → surface mineral map:
- SWIR (2.0–2.5 µm) → kaolinite, illite, montmorillonite (hydrothermal alteration is an indicator of mineralization)
- SAM (Spectral Angle Mapper) + neural network for precise mineral discrimination
- Time series: multi-date Sentinel-2 multispectral imagery → active geochemical color anomalies
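The Spectral Angle Mapper mentioned above treats each spectrum as a vector and measures the angle between a pixel and a reference mineral spectrum, which makes the match insensitive to overall brightness. A minimal sketch, using made-up 4-band spectra and an assumed threshold of 0.10 rad:

```python
import numpy as np

def spectral_angle(pixels, reference):
    """Angle in radians between each pixel spectrum and a reference spectrum.

    pixels    : array of shape (n_pixels, n_bands)
    reference : array of shape (n_bands,)
    """
    pixels = np.asarray(pixels, dtype=float)
    reference = np.asarray(reference, dtype=float)
    dot = pixels @ reference
    norms = np.linalg.norm(pixels, axis=1) * np.linalg.norm(reference)
    # Clip guards against rounding pushing the cosine slightly outside [-1, 1]
    cos = np.clip(dot / norms, -1.0, 1.0)
    return np.arccos(cos)

reference = np.array([0.40, 0.35, 0.20, 0.30])   # made-up reference spectrum
pixels = np.array([
    [0.80, 0.70, 0.40, 0.60],   # same shape as the reference, twice as bright
    [0.30, 0.30, 0.30, 0.30],   # flat spectrum
])
angles = spectral_angle(pixels, reference)
matches = angles < 0.10          # assumed threshold, in radians
```

The first pixel matches despite being twice as bright, while the flat spectrum is rejected. A real workflow would take reference spectra from a spectral library and tune the threshold per mineral.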
Computer vision for interpreting geological structures:
- Recognition of lineaments (faults) on DEMs and imagery: LSD (Line Segment Detector) algorithm + neural network filtering
- 3D reconstruction of geological outcrops by photogrammetry (DJI Phantom + RealityCapture → geological map)
- Automatic measurement of bedding attitudes from core photographs
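Once a line segment detector has produced candidate lineaments, a simple geometric pass usually comes before any learned filtering. The sketch below is one such pass, assuming segments are given as `(x1, y1, x2, y2)` in map metres with y pointing north. The 500 m length cut-off and both function names are hypothetical.

```python
import math

def segment_stats(segment):
    """Return (length, azimuth) of a segment; azimuth is 0-180° clockwise from north."""
    x1, y1, x2, y2 = segment
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    # atan2(dx, dy) measures from north; modulo 180 because a line has no direction
    azimuth = math.degrees(math.atan2(dx, dy)) % 180.0
    return length, azimuth

def filter_lineaments(segments, min_length=500.0):
    """Keep segments at least `min_length` long, with their length and azimuth."""
    kept = []
    for seg in segments:
        length, azimuth = segment_stats(seg)
        if length >= min_length:
            kept.append((seg, length, azimuth))
    return kept

segments = [
    (0, 0, 0, 1000),      # north-south
    (0, 0, 1000, 1000),   # north-east
    (0, 0, 100, 0),       # short east-west segment, dropped
]
lineaments = filter_lineaments(segments)
```

The azimuths of the kept segments can then be binned into a rose diagram to compare dominant trends with known fault systems.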
Development timeframe: 4–7 months for an AI exploration system with prospectivity modeling, geophysical data processing, and probabilistic resource assessment.







