Implementing AI Text Summarization on a Website
Text summarization is one of the most in-demand features for content-heavy sites: news aggregators, legal portals, medical references, and knowledge bases. Users get a brief summary in 2–3 seconds instead of reading a 10-page document.
Summarization Approaches
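Extractive summarization: select key sentences from the original text and return them unchanged. It is fast and predictable, and because no new text is generated it cannot hallucinate. Typical implementations use the TextRank or LexRank algorithms, available in libraries such as sumy (gensim also shipped a TextRank summarizer, but removed it in version 4.0).
Abstractive summarization: generate new text that conveys the essence of the source. Quality is usually higher, but it requires an LLM. Use it directly for texts up to about 4000 tokens.
Hybrid approach: an extractive pass first reduces the text to about 20% of its original length, then an LLM writes the final summary. This works with documents of any length.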
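To make the hybrid idea concrete, here is a minimal sketch that uses only the standard library. It is not TextRank: sentences are scored by average word frequency, which is a simpler stand-in chosen for illustration. The names `extract_key_sentences`, `hybrid_summary`, and `STOP_WORDS` are hypothetical, the sentence splitter is naive, and the abstractive step is passed in as a callable so that any LLM wrapper can be plugged in.

```python
import re
from collections import Counter
from typing import Callable

# Tiny illustrative stop-word list (assumption); real code would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
              "it", "that", "this", "for", "on", "with", "as", "was", "by"}


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text: str) -> list[str]:
    return [w for w in re.findall(r"\w+", text.lower()) if w not in STOP_WORDS]


def extract_key_sentences(text: str, ratio: float = 0.2) -> str:
    """Keep roughly `ratio` of the sentences, unchanged and in original order."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    freq = Counter(content_words(text))

    def score(sentence: str) -> float:
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    keep = max(1, round(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:keep])  # restore document order
    return " ".join(sentences[i] for i in chosen)


def hybrid_summary(text: str, abstractive: Callable[[str], str],
                   ratio: float = 0.2) -> str:
    # The extractive pass shrinks the input; the callable writes the final summary.
    return abstractive(extract_key_sentences(text, ratio))
```

In production the scoring function would likely be replaced by a library implementation of TextRank or LexRank. The overall shape (reduce first, then call the LLM) stays the same.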
Integration via LLM API
For most tasks, a small model such as OpenAI GPT-4o-mini or Anthropic Claude Haiku is sufficient: both are cheaper than the flagship models and handle summarization well.
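from openai import OpenAI

# Reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

def summarize_text(text: str, max_words: int = 150, language: str = "en") -> str:
    """Return an abstractive summary of `text` in the requested language."""
    prompt = f"""Create a concise summary of the following text in {language}.
Maximum {max_words} words. Keep key facts, numbers, and conclusions.
Do not add introductory phrases like "This text discusses".

Text:
{text}"""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=300,  # hard cap on output length; raise it for larger max_words
        temperature=0.3,
    )
    # content can be None in rare cases, so fall back to an empty string
    return response.choices[0].message.content or ""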
Temperature 0.3 gives stable results without unnecessary variation.
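The example hard-codes max_tokens=300 while max_words is a parameter, so a large max_words could be cut off mid-sentence. One way to keep the two in step is to derive the token cap from the word limit. The sketch below assumes the common rule of thumb that one token is about 0.75 English words. That ratio is an approximation, and other languages often need more tokens per word. The helper name `max_tokens_for` and the 20% headroom are hypothetical.

```python
import math

# Rule of thumb for English: one token is roughly 0.75 words (assumption).
WORDS_PER_TOKEN = 0.75


def max_tokens_for(max_words: int, headroom: float = 1.2) -> int:
    """Estimate a max_tokens value large enough for a max_words-word summary."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    # Convert words to tokens, then add headroom so the model can finish a sentence.
    return math.ceil(max_words / WORDS_PER_TOKEN * headroom)
```

With the default of 150 words this lands somewhat below the fixed cap of 300, so the original value is safe for the default but not for much larger limits.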
Processing Long Documents
GPT-4o-mini has a 128K-token context window, but sending a whole document in one request is expensive. For texts over 5000 words, the following approach works better:
- Split the document into chunks of 1500–2000 words with a 200-word overlap
- Summarize each chunk independently
- Combine the intermediate summaries and summarize them again
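def chunk_text(text: str, chunk_size: int = 1500, overlap: int = 200) -> list[str]:
    """Split text into word-based chunks that overlap by `overlap` words."""
    if not 0 <= overlap < chunk_size:
        # overlap >= chunk_size would make the loop below run forever
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = start + chunk_size
        chunks.append(" ".join(words[start:end]))
        if end >= len(words):
            break  # last chunk reached; avoids a redundant tail-only chunk
        start = end - overlap
    return chunks

def summarize_long_document(text: str) -> str:
    chunks = chunk_text(text)
    if len(chunks) <= 1:
        # Short input: a single call is enough
        return summarize_text(text, max_words=200)
    # Summarize each chunk independently
    chunk_summaries = [summarize_text(chunk, max_words=100) for chunk in chunks]
    # Merge the partial summaries into one final summary
    combined = "\n\n".join(chunk_summaries)
    return summarize_text(combined, max_words=200)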
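Before running this on a large archive it helps to estimate how many API calls a document will trigger. The sketch below works out the arithmetic for the default settings: each chunk after the first adds chunk_size - overlap = 1300 new words. It counts the minimum number of windows needed to cover the document and assumes a single-chunk document is summarized with one call. The function name `plan_chunked_summary` is hypothetical.

```python
import math


def plan_chunked_summary(n_words: int, chunk_size: int = 1500,
                         overlap: int = 200) -> dict:
    """Estimate how many chunks and API calls a document of n_words needs."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if n_words <= 0:
        return {"chunks": 0, "api_calls": 0}
    stride = chunk_size - overlap  # new words covered by each additional chunk
    extra = max(0, n_words - chunk_size)  # words not covered by the first chunk
    chunks = 1 + math.ceil(extra / stride)
    # One call per chunk, plus one call to merge the partial summaries.
    api_calls = chunks + 1 if chunks > 1 else 1
    return {"chunks": chunks, "api_calls": api_calls}


print(plan_chunked_summary(5000))   # {'chunks': 4, 'api_calls': 5}
print(plan_chunked_summary(10000))  # {'chunks': 8, 'api_calls': 9}
```

A 5000-word document therefore costs five requests rather than one, which is worth weighing against the price of a single long-context call.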
Caching Results
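Summarizing the same text repeatedly wastes money. Cache results under a hash of the text and the summary parameters:
import hashlib
import json
import redis

cache = redis.Redis()
CACHE_TTL = 86400 * 7  # 7 days

def get_summary_cached(text: str, **kwargs) -> str:
    # Include kwargs (max_words, language) in the key so that different
    # settings do not return each other's cached summaries.
    payload = json.dumps(kwargs, sort_keys=True) + "\n" + text
    key = "summary:" + hashlib.sha256(payload.encode()).hexdigest()
    cached = cache.get(key)
    if cached is not None:
        return cached.decode()
    summary = summarize_text(text, **kwargs)
    cache.setex(key, CACHE_TTL, summary)
    return summary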
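A possible refinement of the cache key is sketched below. It collapses whitespace so that trivially different copies of the same article share one entry, and it mixes the model name and a prompt version into the hash so that changing either invalidates old summaries. The function name `make_cache_key` and the `prompt_version` tag are hypothetical and not part of any library.

```python
import hashlib
import json


def make_cache_key(text: str, *, model: str = "gpt-4o-mini",
                   prompt_version: str = "v1", **params) -> str:
    """Build a deterministic cache key from the text and everything that shapes the output."""
    # Collapse runs of whitespace so reformatted copies hash identically.
    normalized = " ".join(text.split())
    payload = json.dumps(
        {"text": normalized, "model": model,
         "prompt_version": prompt_version, "params": params},
        sort_keys=True,  # stable ordering regardless of keyword order
        ensure_ascii=False,
    )
    return "summary:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Bumping `prompt_version` whenever the prompt text changes avoids serving summaries written under the old instructions until the TTL expires.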
UI Component
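import { useState } from 'react';

function TextSummary({ text, maxLength = 150 }) {
  const [summary, setSummary] = useState('');
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState('');

  async function fetchSummary() {
    setLoading(true);
    setError('');
    try {
      const res = await fetch('/api/summarize', {
        method: 'POST',
        // Without this header many backends will not parse the JSON body
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ text, max_words: maxLength }),
      });
      if (!res.ok) throw new Error(`Request failed: ${res.status}`);
      const data = await res.json();
      setSummary(data.summary);
    } catch (err) {
      setError('Could not load summary. Please try again.');
    } finally {
      // Re-enable the button whether the request succeeded or failed
      setLoading(false);
    }
  }

  return (
    <div className="summary-box">
      <button onClick={fetchSummary} disabled={loading}>
        {loading ? 'Summarizing...' : 'Get Summary'}
      </button>
      {error && <p className="summary-error">{error}</p>}
      {summary && <p className="summary-text">{summary}</p>}
    </div>
  );
}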
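The component posts to /api/summarize, which is not shown here. Below is a framework-agnostic sketch of what that endpoint's handler might do: validate the JSON body, enforce limits, and return a payload with the `summary` field the component reads. The function name `handle_summarize`, the limits, and the status codes are assumptions. In a real application the function would be wired into a Flask, FastAPI, or Django route.

```python
import json
from typing import Callable

MAX_INPUT_CHARS = 200_000  # hypothetical limit to protect the API budget


def handle_summarize(body: bytes, summarizer: Callable[..., str]) -> tuple[int, dict]:
    """Validate a POST /api/summarize body and return (status_code, json_payload)."""
    try:
        data = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return 400, {"error": "Body must be valid JSON"}
    if not isinstance(data, dict):
        return 400, {"error": "Body must be a JSON object"}

    text = data.get("text")
    max_words = data.get("max_words", 150)
    if not isinstance(text, str) or not text.strip():
        return 400, {"error": "'text' must be a non-empty string"}
    if len(text) > MAX_INPUT_CHARS:
        return 413, {"error": "Text is too long"}
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_words, bool) or not isinstance(max_words, int) \
            or not 10 <= max_words <= 500:
        return 400, {"error": "'max_words' must be an integer between 10 and 500"}

    try:
        summary = summarizer(text, max_words=max_words)
    except Exception:
        # Hide upstream details from the client; log them server-side in real code.
        return 502, {"error": "Summarization failed"}
    return 200, {"summary": summary}
```

Passing the summarizer in as an argument keeps the handler testable without network access. In production it would receive the cached summarization function.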
Timeline
- Abstractive summarization via API — 2–3 days
- Long document handling (chunking) — plus 1–2 days
- Extractive + abstractive hybrid — plus 2–3 days
- UI integration + caching — plus 1–2 days
- Multi-language + quality metrics — 2–3 weeks







