MLOps Platform Setup and Configuration

We design and deploy artificial intelligence systems, from prototype to production-ready solution. Our team combines expertise in machine learning, data engineering, and MLOps so that AI works in real business settings, not only in the lab.

Deploying an MLOps platform

An MLOps platform is an integrated stack of tools covering the full lifecycle of ML models: data, experiments, training, deployment, and monitoring. Its purpose is to eliminate the "it only works on my laptop" situation by making every run reproducible outside the machine it was developed on.

MLOps platform components

Layer            Open source            Managed
Experiments      MLflow, W&B            W&B, Comet
Feature Store    Feast, Hopsworks       Tecton
Pipelines        Kubeflow, Airflow      AWS SageMaker Pipelines
Model Registry   MLflow                 W&B, Vertex AI
Serving          vLLM, Triton, Seldon   AWS SageMaker Endpoints
Monitoring       Evidently, Prometheus  AWS Model Monitor
Data Versioning  DVC, Delta Lake        Databricks

Self-hosted stack on Kubernetes

# mlops-platform/values.yaml for the Helm umbrella chart
mlflow:
  enabled: true
  artifactRoot: s3://mlops-artifacts/mlflow
  backendStore: postgresql://mlops:xxx@postgres:5432/mlflow
  service:
    type: ClusterIP
    port: 5000

feast:
  enabled: true
  registry:
    type: sql
    path: postgresql://mlops:xxx@postgres:5432/feast
  online_store:
    type: redis
    connection_string: redis:6379

kubeflow:
  enabled: true

seldon_core:
  enabled: true
  usageMetrics: true

prometheus:
  enabled: true
  storage: 30Gi

grafana:
  enabled: true
  adminPassword: changeme123  # placeholder; replace before deploying

# Deployment
helm repo add mlops-platform https://charts.mlops-platform.io
helm install mlops mlops-platform/full-stack \
  -f values.yaml \
  --namespace mlops \
  --create-namespace
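
The values file above carries credentials inline: a database password inside the connection strings and a Grafana admin password. As a rough pre-deploy check, the sketch below walks an already parsed values mapping and flags entries that look like hard-coded secrets. The function name `find_inline_secrets` and both regular expressions are illustrative assumptions, not part of Helm or of the chart.

```python
import re
from typing import Any, Iterator

# Hypothetical heuristics: key names that usually hold secrets, and
# URLs of the form scheme://user:password@host.
SECRET_KEY_PATTERN = re.compile(r"(password|secret|token)", re.IGNORECASE)
URL_CREDENTIALS_PATTERN = re.compile(
    r"^[a-z][a-z0-9+.-]*://[^/\s:@]+:[^/\s@]+@", re.IGNORECASE
)


def find_inline_secrets(values: dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield dotted paths of string values that look like hard-coded credentials."""
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections such as mlflow.service
            yield from find_inline_secrets(value, path)
        elif isinstance(value, str):
            if SECRET_KEY_PATTERN.search(key) or URL_CREDENTIALS_PATTERN.match(value):
                yield path


# A subset of the values shown above, as a YAML parser would return them
values = {
    "mlflow": {
        "artifactRoot": "s3://mlops-artifacts/mlflow",
        "backendStore": "postgresql://mlops:xxx@postgres:5432/mlflow",
    },
    "feast": {"online_store": {"type": "redis", "connection_string": "redis:6379"}},
    "grafana": {"enabled": True, "adminPassword": "changeme123"},
}

flagged = sorted(find_inline_secrets(values))
print(flagged)
```

Flagged entries are candidates for moving into Kubernetes Secrets or an external secret manager rather than keeping them in a versioned values file.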

MLflow as a central component

MLflow is the minimum required component for any MLOps platform:

import mlflow
import mlflow.sklearn
from lightgbm import LGBMClassifier
from mlflow.models import infer_signature
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score

# Configure the tracking server
mlflow.set_tracking_uri("http://mlflow.mlops.svc.cluster.local:5000")
mlflow.set_experiment("fraud-detection-v2")

# X_train, X_test, y_train, y_test are assumed to be prepared earlier
params = {
    "n_estimators": 500,
    "learning_rate": 0.05,
    "max_depth": 6,
}

with mlflow.start_run(run_name="lgbm-baseline") as run:
    # Log hyperparameters
    mlflow.log_params(params)

    # Train
    model = LGBMClassifier(**params)
    model.fit(X_train, y_train)

    # Metrics
    y_pred = model.predict(X_test)
    mlflow.log_metrics({
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]),
    })

    # Log the model together with its input/output signature
    signature = infer_signature(X_train, model.predict(X_train))
    mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        signature=signature,
        registered_model_name="fraud-detection",
    )

    # Artifacts: feature importance, SHAP report.
    # plot_feature_importance is a user-defined helper that returns a matplotlib
    # figure; shap_values.html must already exist on disk.
    mlflow.log_figure(plot_feature_importance(model), "feature_importance.png")
    mlflow.log_artifact("shap_values.html")

print(f"Run ID: {run.info.run_id}")
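
The run above logs precision, recall, and F1 rather than accuracy, which matters for an imbalanced task such as fraud detection. The worked example below computes those metrics from confusion-matrix counts; the counts and the helper name `classification_metrics` are made up for illustration and do not come from a real model.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Illustrative counts: 10,000 transactions, of which 100 are fraudulent
metrics = classification_metrics(tp=60, fp=20, fn=40, tn=9880)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts accuracy is 0.994 even though 40 of the 100 fraudulent transactions are missed (recall 0.6), which is why accuracy alone is a poor signal to track for this kind of model.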

Promotion workflow

import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://mlflow.mlops.svc.cluster.local:5000")
client = MlflowClient()

# Register the best model from the experiment
experiment = client.get_experiment_by_name("fraud-detection-v2")
best_run = client.search_runs(
    experiment_ids=[experiment.experiment_id],
    order_by=["metrics.f1 DESC"],
    max_results=1,
)[0]

model_version = mlflow.register_model(
    f"runs:/{best_run.info.run_id}/model",
    name="fraud-detection",
)

# Staging → Production workflow.
# Note: registry stages are deprecated since MLflow 2.9 in favor of aliases
# (client.set_registered_model_alias); the calls below assume an MLflow
# version that still supports stages.
client.transition_model_version_stage(
    name="fraud-detection",
    version=model_version.version,
    stage="Staging",
    archive_existing_versions=False,
)

# After validation on staging
client.transition_model_version_stage(
    name="fraud-detection",
    version=model_version.version,
    stage="Production",
    archive_existing_versions=True,  # archive the previous Production version
)
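
The "after validation on staging" step above is left implicit. One possible shape for it is an explicit gate that compares the candidate's metrics with those of the current Production model before the second transition. The sketch below is an assumption about how such a gate could look: the names `GateResult` and `evaluate_promotion`, the thresholds, and the metric values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    promote: bool
    reasons: list[str]


def evaluate_promotion(
    candidate: dict[str, float],
    production: dict[str, float],
    min_f1_gain: float = 0.0,
    max_recall_drop: float = 0.01,
) -> GateResult:
    """Decide whether a Staging model may replace the Production model."""
    reasons: list[str] = []

    # The candidate must not be worse than Production on F1
    f1_gain = candidate["f1"] - production["f1"]
    if f1_gain < min_f1_gain:
        reasons.append(f"F1 gain {f1_gain:.3f} is below {min_f1_gain:.3f}")

    # Recall may drop only within a small tolerance
    recall_drop = production["recall"] - candidate["recall"]
    if recall_drop > max_recall_drop:
        reasons.append(f"recall dropped by {recall_drop:.3f}")

    return GateResult(promote=not reasons, reasons=reasons)


# Illustrative metric values, not real results
production_metrics = {"f1": 0.81, "recall": 0.78}
good_candidate = evaluate_promotion({"f1": 0.84, "recall": 0.80}, production_metrics)
bad_candidate = evaluate_promotion({"f1": 0.82, "recall": 0.70}, production_metrics)

print(good_candidate)
print(bad_candidate)
```

In a pipeline, the transition to Production would run only when `promote` is true, and the `reasons` list would go into the run's log so that a rejected version can be traced later.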

Feast Feature Store

from feast import FeatureStore

store = FeatureStore(repo_path="./feature_repo")

# One feature list shared by training and serving, so both paths see the same features
FEATURES = [
    "customer_stats:transaction_count_7d",
    "customer_stats:avg_amount_30d",
    "merchant_stats:fraud_rate_90d",
]

# Historical features for training.
# entity_df_with_timestamps is a DataFrame prepared elsewhere with the entity
# keys (customer_id, merchant_id) and an event_timestamp column.
training_df = store.get_historical_features(
    entity_df=entity_df_with_timestamps,
    features=FEATURES,
).to_df()

# Online inference: the same features, so no training-serving skew
online_features = store.get_online_features(
    features=FEATURES,
    entity_rows=[{"customer_id": "12345", "merchant_id": "MCC001"}],
).to_dict()
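
Historical retrieval avoids leakage by joining each training row with the feature value that was valid at that row's timestamp, never a later one. The sketch below illustrates that point-in-time rule in plain Python. It is a conceptual model only, not Feast's implementation; the function name `point_in_time_lookup` and the sample data are assumptions, and the TTL window that Feast also applies is left out.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional


def point_in_time_lookup(
    feature_rows: list[tuple[str, datetime, float]],
    entity_rows: list[tuple[str, datetime]],
) -> list[Optional[float]]:
    """For each (entity_id, timestamp), return the latest feature value
    recorded at or before that timestamp, or None if there is none."""
    history: dict[str, tuple[list[datetime], list[float]]] = {}
    for entity_id, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        times, values = history.setdefault(entity_id, ([], []))
        times.append(ts)
        values.append(value)

    result: list[Optional[float]] = []
    for entity_id, label_ts in entity_rows:
        times, values = history.get(entity_id, ([], []))
        # Number of feature rows with a timestamp <= label_ts
        idx = bisect_right(times, label_ts)
        result.append(values[idx - 1] if idx else None)
    return result


# Illustrative weekly snapshots of transaction_count_7d for one customer
feature_rows = [
    ("12345", datetime(2024, 1, 1), 3.0),
    ("12345", datetime(2024, 1, 8), 7.0),
    ("12345", datetime(2024, 1, 15), 12.0),
]
entity_rows = [
    ("12345", datetime(2024, 1, 10)),   # sees the Jan 8 value, not Jan 15
    ("12345", datetime(2023, 12, 31)),  # no feature value existed yet
    ("99999", datetime(2024, 1, 10)),   # unknown customer
]

joined = point_in_time_lookup(feature_rows, entity_rows)
print(joined)
```

The first row receives the January 8 value even though a newer one exists, because the January 15 value was not yet known at the time of that event.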

Monitoring production models

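# Written against the legacy Evidently Report API (evidently.report,
# evidently.metric_preset); newer releases changed these import paths.
from evidently.report import Report
from evidently.metric_preset import ClassificationPreset, DataDriftPreset

# Weekly drift report.
# training_data and production_data_last_week are DataFrames prepared elsewhere;
# ClassificationPreset needs target and prediction columns in both.
report = Report(metrics=[DataDriftPreset(), ClassificationPreset()])
report.run(reference_data=training_data, current_data=production_data_last_week)
report.save_html("drift_report.html")

# Dataset-level drift flag (boolean), used here to trigger an alert
dataset_drift_detected = report.as_dict()["metrics"][0]["result"]["dataset_drift"]
if dataset_drift_detected:
    # alerts is a placeholder for the team's notification client
    alerts.send("Data drift detected", severity="warning")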
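
To give a feel for what a per-feature drift score measures, the sketch below computes the Population Stability Index (PSI) between a reference sample and a current sample in plain Python. PSI is used here only as an illustration; the Evidently preset above selects its own statistical tests, and the function name, bin count, and sample data are assumptions.

```python
import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI over equal-width bins derived from the reference sample."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins

    def proportions(sample: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(sample), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Illustrative data: a uniform reference and a sample shifted to the upper half
reference_sample = [i / 100 for i in range(100)]
shifted_sample = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(reference_sample, reference_sample)
psi_shifted = population_stability_index(reference_sample, shifted_sample)
print(f"same: {psi_same:.4f}, shifted: {psi_shifted:.4f}")
```

A common rule of thumb treats a PSI above roughly 0.25 as a significant shift, though the threshold is a convention and should be tuned per feature.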

Deployment timeframes

Week 1–2: MLflow tracking server + S3 artifact store, basic experiment logging

Week 3–4: Model registry, promotion workflow, first production model

Month 2: Feast feature store, fixing training-serving skew

Month 3: Kubeflow Pipelines, automatic retrain pipelines

Month 4: Evidently drift monitoring, full integration of all components