Semantic Kernel Integration for AI Orchestration
Semantic Kernel (SK) is Microsoft's SDK for integrating LLMs into .NET, Python, and Java applications. It targets enterprise developers who need strong typing, dependency injection, and integration with Azure AI and corporate systems. Unlike LangChain and LlamaIndex, SK offers an SDK experience close to traditional enterprise development.
Basic Semantic Kernel Structure
import asyncio

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion, OpenAITextEmbedding
from semantic_kernel.prompt_template import PromptTemplateConfig

kernel = Kernel()

# Add AI services
kernel.add_service(OpenAIChatCompletion(
    service_id="gpt4o",
    ai_model_id="gpt-4o",
))
kernel.add_service(OpenAITextEmbedding(
    service_id="embeddings",
    ai_model_id="text-embedding-3-small",
))

# Prompt as a kernel function
prompt = """You are a corporate data analyst.
Answer the question based on the provided context.
Context: {{$context}}
Question: {{$question}}"""

settings = kernel.get_prompt_execution_settings_from_service_id("gpt4o")
settings.max_tokens = 2000
settings.temperature = 0.1

analysis_function = kernel.add_function(
    function_name="analyze",
    plugin_name="analytics",
    prompt_template_config=PromptTemplateConfig(
        template=prompt,
        name="analyze",
        description="Analyze data based on context",
        execution_settings=settings,  # attach the settings configured above
    ),
)

async def run():
    result = await kernel.invoke(
        analysis_function,
        context="Q1 2025 revenue: 45.2M, plan: 48M, deviation: -5.8%",
        question="What are the main causes of the deviation and what do you recommend?",
    )
    print(result)

asyncio.run(run())
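The `{{$context}}` and `{{$question}}` placeholders are filled from the keyword arguments passed to `kernel.invoke`. SK has its own template engine; purely as an illustration of the substitution semantics (not SK's actual renderer), a dependency-free stand-in might look like this:

```python
import re

def render(template: str, **variables: str) -> str:
    # Replace each {{$name}} placeholder with the matching keyword argument.
    return re.sub(r"\{\{\$(\w+)\}\}", lambda m: variables.get(m.group(1), ""), template)

prompt = "Context: {{$context}}\nQuestion: {{$question}}"
print(render(prompt, context="Q1 revenue: 45.2M", question="Why the deviation?"))
# Context: Q1 revenue: 45.2M
# Question: Why the deviation?
```

Unknown placeholders render as empty strings here; SK's real engine also supports function calls inside templates, which this sketch does not.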
Plugins: Reusable Components
from typing import Annotated

from semantic_kernel.functions import kernel_function

class FinancialPlugin:
    """Plugin for financial analysis."""

    @kernel_function(
        name="calculate_variance",
        description="Calculate plan-actual variance as a percentage",
    )
    def calculate_variance(
        self,
        actual: Annotated[float, "Actual value"],
        plan: Annotated[float, "Planned value"],
    ) -> Annotated[str, "Percentage variance"]:
        if plan == 0:
            return "Error: planned value is zero"
        variance = (actual - plan) / plan * 100
        return f"{variance:+.2f}%"

    @kernel_function(
        name="format_currency",
        description="Format a number as currency",
    )
    def format_currency(
        self,
        amount: Annotated[float, "Amount"],
        currency: Annotated[str, "Currency (RUB, USD, EUR)"] = "RUB",
    ) -> str:
        symbols = {"RUB": "₽", "USD": "$", "EUR": "€"}
        symbol = symbols.get(currency, currency)
        return f"{symbol}{amount:,.0f}"

# Register the plugin
kernel.add_plugin(FinancialPlugin(), plugin_name="finance")

# Plugin from a directory of YAML/txt prompts
kernel.add_plugin(parent_directory="./plugins", plugin_name="reporting")
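Methods decorated with `@kernel_function` remain ordinary Python methods, so the business logic can be unit-tested without a kernel. Restated without the decorator for a dependency-free sketch:

```python
def calculate_variance(actual: float, plan: float) -> str:
    # Plan-actual variance as a signed percentage of plan.
    if plan == 0:
        return "Error: planned value is zero"
    return f"{(actual - plan) / plan * 100:+.2f}%"

def format_currency(amount: float, currency: str = "RUB") -> str:
    # Currency symbol plus thousands-separated amount.
    symbols = {"RUB": "₽", "USD": "$", "EUR": "€"}
    return f"{symbols.get(currency, currency)}{amount:,.0f}"

print(calculate_variance(42.3, 45.0))   # -6.00%
print(format_currency(42_300_000))      # ₽42,300,000
```

Keeping the logic this plain makes the plugin testable in CI without any LLM calls.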
Auto Function Calling: Agent Loop
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import OpenAIChatPromptExecutionSettings
from semantic_kernel.contents import ChatHistory

# Settings with automatic function calling
execution_settings = OpenAIChatPromptExecutionSettings(
    service_id="gpt4o",
    function_choice_behavior=FunctionChoiceBehavior.Auto(
        auto_invoke=True,  # call functions automatically
        maximum_auto_invoke_attempts=10,
    ),
)

chat_service = kernel.get_service("gpt4o")

chat_history = ChatHistory()
chat_history.add_system_message("""You are a corporate financial analyst.
Use the available functions for accurate calculations.
Answer only based on the data.""")
chat_history.add_user_message("Calculate the revenue deviation: actual 42.3M, plan 45.0M. Output in rubles.")

async def chat():
    result = await chat_service.get_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
        kernel=kernel,
    )
    print(result.content)

asyncio.run(chat())
# The agent will automatically call calculate_variance and format_currency
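Conceptually, `auto_invoke=True` runs a loop: the model either answers or requests a tool call; SK executes the tool, appends the result to the history, and asks the model again, up to the attempt limit. A simplified stand-in for that loop (toy message format, not SK internals):

```python
def agent_loop(model, tools, history, max_attempts=10):
    # Let the model either answer or request a tool call, up to max_attempts times.
    for _ in range(max_attempts):
        reply = model(history)
        if reply["type"] == "text":
            return reply["content"]
        result = tools[reply["name"]](**reply["args"])
        history.append({"role": "tool", "name": reply["name"], "content": result})
    raise RuntimeError("max auto-invoke attempts exceeded")

# Toy model: first requests a tool call, then answers using its result.
def toy_model(history):
    tool_msgs = [m for m in history if m.get("role") == "tool"]
    if not tool_msgs:
        return {"type": "tool_call", "name": "calculate_variance",
                "args": {"actual": 42.3, "plan": 45.0}}
    return {"type": "text", "content": f"Variance: {tool_msgs[-1]['content']}"}

tools = {"calculate_variance":
         lambda actual, plan: f"{(actual - plan) / plan * 100:+.2f}%"}
print(agent_loop(toy_model, tools, history=[]))  # Variance: -6.00%
```

The attempt limit plays the same role as `maximum_auto_invoke_attempts`: it prevents a model that keeps requesting tools from looping forever.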
Memory and Vector Store
from semantic_kernel.connectors.memory.chroma import ChromaMemoryStore
from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory

memory_store = ChromaMemoryStore(persist_directory="./chroma_db")
memory = SemanticTextMemory(
    storage=memory_store,
    embeddings_generator=kernel.get_service("embeddings"),
)

async def demo_memory():
    # Save information to memory
    await memory.save_information(
        collection="company_policies",
        id="policy_001",
        text="Travel allowance policy: 2500 RUB/day in Russia, 80 USD abroad.",
        description="Travel",
    )

    # Search memory
    results = await memory.search(
        collection="company_policies",
        query="What are the allowances for a trip to Moscow?",
        limit=3,
        min_relevance_score=0.7,
    )
    for result in results:
        print(f"Score: {result.relevance:.3f}: {result.text}")

asyncio.run(demo_memory())
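The `min_relevance_score` threshold filters hits by embedding similarity; with the normalized embeddings typical of OpenAI models that is effectively cosine similarity between the query vector and each stored vector. A toy illustration with made-up 3-dimensional vectors:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

query = [0.9, 0.1, 0.0]
docs = {"policy_001": [0.8, 0.2, 0.1], "policy_002": [0.0, 0.1, 0.9]}

hits = {doc_id: cosine_similarity(query, vec) for doc_id, vec in docs.items()}
relevant = [doc_id for doc_id, score in hits.items() if score >= 0.7]
print(relevant)  # ['policy_001']
```

Real embeddings have hundreds to thousands of dimensions, but the threshold semantics are the same: a higher `min_relevance_score` trades recall for precision.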
Azure AI Integration
from azure.ai.inference.aio import ChatCompletionsClient
from azure.identity.aio import DefaultAzureCredential

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

# Azure OpenAI
kernel.add_service(AzureChatCompletion(
    service_id="azure-gpt4o",
    deployment_name="gpt-4o",
    endpoint="https://your-endpoint.openai.azure.com",
    api_key="...",
))

# Azure AI Foundry (Phi, Mistral, Llama via Azure)
client = ChatCompletionsClient(
    endpoint="https://your-model.inference.ai.azure.com",
    credential=DefaultAzureCredential(),
)
kernel.add_service(AzureAIInferenceChatCompletion(
    service_id="phi-4",
    ai_model_id="phi-4",
    client=client,
))
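With several chat services registered (`gpt4o`, `azure-gpt4o`, `phi-4`), the kernel resolves which one handles a call from the `service_id` in the execution settings. A minimal stand-in for that lookup (illustration only, not SK's resolver):

```python
class ServiceRegistry:
    """Toy service registry keyed by service_id."""

    def __init__(self):
        self._services = {}

    def add_service(self, service_id, service):
        # Reject duplicate ids so every lookup is unambiguous.
        if service_id in self._services:
            raise ValueError(f"duplicate service_id: {service_id}")
        self._services[service_id] = service

    def get_service(self, service_id):
        return self._services[service_id]

registry = ServiceRegistry()
registry.add_service("gpt4o", "OpenAI gpt-4o")
registry.add_service("phi-4", "Azure AI Foundry phi-4")
print(registry.get_service("phi-4"))  # Azure AI Foundry phi-4
```

This is why routing a request to a cheaper or on-premises model is a one-line change: swap the `service_id` in the settings, not the calling code.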
Practical Case: .NET Enterprise Application with AI
Context: a large logistics company (.NET/C# backend) integrated SK to build an AI assistant for dispatchers.
Plugins:
- ShipmentPlugin — queries to the TMS (transportation management system)
- RoutePlugin — route, cost, and deadline calculation
- CustomerPlugin — customer data and order history
- AlertPlugin — sending delay notifications
// C# version
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(deploymentName, endpoint, apiKey)
    .Build();

kernel.Plugins.AddFromType<ShipmentPlugin>();
kernel.Plugins.AddFromType<RoutePlugin>();

var settings = new OpenAIPromptExecutionSettings {
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

var response = await kernel.InvokePromptAsync(
    "Where is the cargo for bill TN-12345 now? Are there any delays?",
    new KernelArguments(settings)
);
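The shape of the ShipmentPlugin behind that prompt can be sketched as follows (in Python rather than C#, with a hypothetical stubbed TMS record instead of a real TMS call):

```python
class ShipmentPlugin:
    """Sketch of a TMS lookup plugin; the data below is stubbed for illustration."""

    # Hypothetical in-memory stand-in for the transportation management system.
    _tms = {"TN-12345": {"status": "in transit", "eta_delay_hours": 3}}

    def get_shipment_status(self, bill_number: str) -> str:
        record = self._tms.get(bill_number)
        if record is None:
            return f"No shipment found for bill {bill_number}"
        delay = record["eta_delay_hours"]
        note = f", delayed by {delay}h" if delay else ", on schedule"
        return f"{bill_number}: {record['status']}{note}"

print(ShipmentPlugin().get_shipment_status("TN-12345"))
# TN-12345: in transit, delayed by 3h
```

The production version would replace the `_tms` dictionary with an authenticated call to the TMS API; the function's string contract toward the model stays the same.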
Results:
- Dispatcher response time: 4.5 min → 45 sec
- Integration into existing .NET stack: without architecture rework
- Queries handled without dispatcher: 68%
Timeframes
- Basic SK integration + OpenAI/Azure: 2–4 days
- Developing plugins for business logic: 1–2 weeks
- Agent loop with auto function calling: 1 week
- Integration with corporate .NET systems: 2–4 weeks