Sphinx Search Integration for Website
Sphinx Search and its actively developed fork Manticore Search — proven search engines with long history in high-load projects. Used where MySQL-protocol compatibility, solution maturity, and direct index control matter. Manticore Search added JSON-document support, columnar storage, HTTP API, and real-time (RT) indexes.
When Sphinx / Manticore Chosen
- Legacy projects with existing Sphinx infrastructure
- MySQL compatibility needed — connect via standard clients
- Corpus of hundreds of millions documents: news aggregators, archives
- Full control over tokenizer and morphology
- Integration with existing MySQL/MariaDB replication
Manticore Search vs Sphinx
| Feature | Sphinx 3.x | Manticore Search 6.x |
|---|---|---|
| Development | Slowed | Active |
| RT-indexes | Basic | Full-featured |
| JSON-documents | No | Yes |
| HTTP API | No | Yes (Elasticsearch-compatible) |
| Columnar storage | No | Yes |
| MySQL-protocol | Yes | Yes |
For new projects — prefer Manticore Search.
Installation
# docker-compose.yml
services:
manticore:
image: manticoresearch/manticore:6.2.12
environment:
- EXTRA=1
ports:
- "9306:9306" # MySQL-compatible port
- "9308:9308" # HTTP API
volumes:
- manticore_data:/var/lib/manticore
- ./manticore.conf:/etc/manticoresearch/manticore.conf
Index Configuration
# manticore.conf
index articles {
type = rt
path = /var/lib/manticore/articles
rt_field = title
rt_field = body
rt_field = author
rt_attr_uint = category_id
rt_attr_bigint = created_at
rt_attr_float = rating
rt_attr_string = slug
morphology = stem_ru, stem_en
min_word_len = 2
expand_keywords = 1
min_infix_len = 3
stopwords = /etc/manticoresearch/stopwords_ru.txt
}
searchd {
listen = 0.0.0.0:9306:mysql41
listen = 0.0.0.0:9308:http
log = /var/log/manticore/searchd.log
query_log = /var/log/manticore/query.log
max_matches = 10000
}
MySQL Client Connection (PHP)
Manticore accepts SQL on port 9306 via MySQL protocol:
$pdo = new PDO('mysql:host=localhost;port=9306;charset=utf8', '', '');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
Insert document:
$stmt = $pdo->prepare("
INSERT INTO articles (id, title, body, author, category_id, created_at, rating)
VALUES (:id, :title, :body, :author, :category_id, :created_at, :rating)
");
$stmt->execute([
'id' => $article->id,
'title' => $article->title,
'body' => strip_tags($article->content),
'author' => $article->user->name,
'category_id' => $article->category_id,
'created_at' => $article->created_at->timestamp,
'rating' => $article->rating,
]);
Full-text search with field weights:
$stmt = $pdo->prepare("
SELECT id, title, author, rating,
WEIGHT() AS relevance
FROM articles
WHERE MATCH(:query)
AND category_id = :category_id
ORDER BY relevance DESC, rating DESC
LIMIT :offset, :limit
OPTION ranker=bm25, field_weights=(title=10, body=1, author=2)
");
HTTP API (Elasticsearch-compatible)
# Indexing
curl -X POST 'http://localhost:9308/articles/_doc/1' \
-H 'Content-Type: application/json' \
-d '{
"title": "Article Title",
"body": "Article text",
"category_id": 5
}'
# Full-text search
curl -X POST 'http://localhost:9308/articles/_search' \
-H 'Content-Type: application/json' \
-d '{
"query": { "match": { "title": "search query" } },
"sort": [{ "_score": "desc" }],
"size": 20
}'
Database Sync
Batch indexing via chunk:
public function handle(): void
{
$pdo = $this->getManticoreConnection();
Article::with('user', 'category')
->where('published', true)
->chunk(1000, function ($articles) use ($pdo) {
$pdo->beginTransaction();
foreach ($articles as $article) {
$pdo->prepare("REPLACE INTO articles ...")->execute(
$this->toDocument($article)
);
}
$pdo->commit();
});
}
Event-driven update via Observer:
class ArticleObserver
{
public function saved(Article $article): void
{
ManticoreIndexJob::dispatch($article->id);
}
public function deleted(Article $article): void
{
ManticoreDeleteJob::dispatch($article->id);
}
}
Russian Morphology
Sphinx/Manticore includes stem_ru. For accurate morphology use morphology = lemmatize_ru_all — requires separate lemmatize dictionaries. Finds "running" by query "run".
Stop-words excluded via file — prepositions, conjunctions, particles.
Result Highlighting
SELECT id, title,
SNIPPET(body, :query,
'limit=200, around=5, html_strip_mode=strip') AS excerpt
FROM articles
WHERE MATCH(:query)
LIMIT 20
SNIPPET() returns text fragment with highlighted matches — ready for display.
Work Timelines
| Stage | Time |
|---|---|
| Installation, config, index schema | 1 day |
| Initial indexing + synchronizer | 2 days |
| Search API + tests | 2 days |
| UI: results, snippets, pagination | 1–2 days |
Total: 6–7 working days.







