Website internal linking structure optimization

Optimizing Internal Linking Structure of Your Website

Internal linking is the system of links between pages of the same site. It distributes link weight (PageRank) between pages and helps search robots discover and index content. A well-planned structure directs more of that weight to the pages that matter most, which makes them more authoritative.

Audit of Current Structure

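from urllib.parse import urldefrag, urlparse

import networkx as nx
import scrapy

class InternalLinksSpider(scrapy.Spider):
    name = 'internal_links'
    start_urls = ['https://company.com']

    def __init__(self, *args, **kwargs):
        # Let Scrapy finish its own spider setup before adding state
        super().__init__(*args, **kwargs)
        # Directed graph: an edge A -> B means page A links to page B
        self.graph = nx.DiGraph()

    def parse(self, response):
        current_url = response.url

        for link in response.css('a[href]::attr(href)').getall():
            # Resolve relative links and drop #fragments so one page is one node
            absolute = urldefrag(response.urljoin(link)).url
            # Match on the hostname; a substring test would also accept
            # external URLs that merely contain "company.com"
            host = urlparse(absolute).hostname or ''
            if host == 'company.com' or host.endswith('.company.com'):
                self.graph.add_edge(current_url, absolute)
                yield response.follow(absolute, self.parse)

    def closed(self, reason):
        # Pages with highest PageRank
        pagerank = nx.pagerank(self.graph)
        top_pages = sorted(pagerank.items(), key=lambda x: x[1], reverse=True)[:20]
        print("Top pages by PageRank:")
        for url, score in top_pages:
            print(f"  {score:.4f}  {url}")

        # Orphan pages (no incoming links). Pages found by following links always
        # have an incoming edge, so this list is only meaningful once URLs from
        # another source (for example sitemap.xml) are added to the graph as nodes
        orphans = [node for node in self.graph.nodes()
                   if self.graph.in_degree(node) == 0
                   and node not in self.start_urls]

        print(f"Orphan pages: {len(orphans)}")
        for url in orphans[:10]:
            print(f"  {url}")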
Metrics for analysis:

  • Orphan pages: pages with no incoming internal links
  • Crawl depth: the number of clicks needed to reach a page from the home page (important pages should be 1–3 clicks away)
  • PageRank distribution: whether link weight is spread evenly or concentrated on a few pages
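Crawl depth can be measured with a breadth-first search over the link graph. The sketch below works on a plain dictionary of page to linked pages; the sample site structure and the threshold of 3 clicks are assumptions for illustration.

```python
from collections import deque


def crawl_depths(links, home):
    """Breadth-first search: minimum number of clicks from home to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Illustrative link graph: page -> pages it links to
links = {
    "/": ["/catalog", "/blog"],
    "/catalog": ["/catalog/chairs"],
    "/catalog/chairs": ["/catalog/chairs/oak-chair"],
    "/catalog/chairs/oak-chair": ["/catalog/chairs/oak-chair/reviews"],
    "/blog": [],
}

depths = crawl_depths(links, "/")
# Pages deeper than 3 clicks are candidates for extra links from higher levels
too_deep = sorted(url for url, depth in depths.items() if depth > 3)
print(too_deep)
```

For a networkx graph, `nx.single_source_shortest_path_length(graph, home)` should give the same mapping of page to depth.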

Principles of Proper Structure

Flat hierarchy — important pages close to home:

Home → Category → Product page (maximum 3 clicks)

Thematic clusters, where pages on the same topic link to each other:

Pillar page (main): /guide/seo
  ↔ /guide/seo/technical
  ↔ /guide/seo/on-page
  ↔ /guide/seo/link-building
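One way to verify a cluster is to check that every two-way link in the scheme above really exists. This is a minimal sketch: the function name is hypothetical, and the `links` dictionary stands in for data collected by a crawl.

```python
def missing_cluster_links(pillar, cluster_pages, links):
    """Return (source, target) pairs that the pillar/cluster scheme expects but lacks."""
    missing = []
    for page in cluster_pages:
        # The pillar page should link down to every cluster page
        if page not in links.get(pillar, set()):
            missing.append((pillar, page))
        # Every cluster page should link back up to the pillar
        if pillar not in links.get(page, set()):
            missing.append((page, pillar))
    return missing


# Illustrative crawl data: page -> set of pages it links to
links = {
    "/guide/seo": {"/guide/seo/technical", "/guide/seo/on-page"},
    "/guide/seo/technical": {"/guide/seo"},
    "/guide/seo/on-page": set(),
    "/guide/seo/link-building": {"/guide/seo"},
}
cluster = ["/guide/seo/technical", "/guide/seo/on-page", "/guide/seo/link-building"]

gaps = missing_cluster_links("/guide/seo", cluster, links)
for source, target in gaps:
    print(f"Add link from {source} to {target}")
```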

Breadcrumbs — automatic internal links with Schema.org markup:

<nav aria-label="breadcrumb">
  <ol itemscope itemtype="https://schema.org/BreadcrumbList">
    <li itemprop="itemListElement" itemscope itemtype="https://schema.org/ListItem">
      <a itemprop="item" href="/"><span itemprop="name">Home</span></a>
      <meta itemprop="position" content="1">
    </li>
  </ol>
</nav>
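The markup above shows a single item. In practice the full trail is usually generated from the page's position in the site tree. Below is a sketch that renders the same microdata for any trail; `render_breadcrumbs` and the sample trail are hypothetical, and a real project would more likely do this in its template engine.

```python
from html import escape


def render_breadcrumbs(trail):
    """Render (name, url) pairs as a BreadcrumbList with Schema.org microdata."""
    items = []
    for position, (name, url) in enumerate(trail, start=1):
        items.append(
            '    <li itemprop="itemListElement" itemscope itemtype="https://schema.org/ListItem">\n'
            f'      <a itemprop="item" href="{escape(url)}"><span itemprop="name">{escape(name)}</span></a>\n'
            f'      <meta itemprop="position" content="{position}">\n'
            '    </li>'
        )
    return (
        '<nav aria-label="breadcrumb">\n'
        '  <ol itemscope itemtype="https://schema.org/BreadcrumbList">\n'
        + "\n".join(items)
        + '\n  </ol>\n</nav>'
    )


# Illustrative trail: position numbers follow the order of the list
trail = [("Home", "/"), ("Catalog", "/catalog"), ("Chairs & stools", "/catalog/chairs")]
breadcrumbs_html = render_breadcrumbs(trail)
print(breadcrumbs_html)
```

Names and URLs are escaped so that characters such as `&` do not break the markup.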

Automatic Related Articles

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

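def find_related_articles(target_article, all_articles, top_n=5):
    # Skip the target itself, otherwise it ranks first with similarity 1.0.
    # This is an identity check; compare by URL or ID if the target is loaded separately
    candidates = [a for a in all_articles if a is not target_article]
    if not candidates:
        return []

    texts = [a['title'] + ' ' + a['body'] for a in candidates]
    target_text = target_article['title'] + ' ' + target_article['body']

    # Fit TF-IDF on the target plus the candidates; row 0 is the target
    vectorizer = TfidfVectorizer(max_features=1000, stop_words='english')
    tfidf_matrix = vectorizer.fit_transform([target_text] + texts)

    similarities = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:]).flatten()
    # Indices of the top_n most similar candidates, best match first
    top_indices = similarities.argsort()[-top_n:][::-1]

    # The 0.1 threshold filters out weak matches
    return [candidates[i] for i in top_indices if similarities[i] > 0.1]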

Anchor Text Optimization

Anchor text tells search engines about the topic of the target page:

Bad: <a href="/guide/seo">here</a>
Bad: <a href="/guide/seo">click</a>

Good: <a href="/guide/seo">SEO guide</a>
Good: <a href="/guide/technical-seo">technical SEO audit</a>
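Generic anchors like the "bad" examples above can be found automatically. The sketch below uses the standard library HTML parser; the list of generic phrases is an assumption and should be adapted to the language and tone of the site.

```python
from html.parser import HTMLParser

# Assumed list of uninformative anchor texts; extend as needed
GENERIC_ANCHORS = {"here", "click", "click here", "read more", "more", "link"}


class AnchorAudit(HTMLParser):
    """Collect (href, anchor text) pairs for links whose text is generic."""

    def __init__(self):
        super().__init__()
        self.generic = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an <a href="..."> element
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace before comparing with the generic list
            text = " ".join("".join(self._text).split())
            if text.lower() in GENERIC_ANCHORS:
                self.generic.append((self._href, text))
            self._href = None


page_html = (
    '<p>Read the guide <a href="/guide/seo">here</a> or see the '
    '<a href="/guide/technical-seo">technical SEO audit</a>. '
    '<a href="/pricing">Click here</a></p>'
)

audit = AnchorAudit()
audit.feed(page_html)
print(audit.generic)
```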

Fixing Orphan Pages

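import re

# Minimal stop-word list for illustration; extend it for real content
STOP_WORDS = {'a', 'an', 'and', 'for', 'in', 'of', 'on', 'the', 'to', 'with'}

def extract_keywords(title):
    """Simple helper: lowercase words of the title without stop words and short tokens"""
    words = re.findall(r'\w+', title.lower())
    return [w for w in words if w not in STOP_WORDS and len(w) > 2]

def fix_orphan_pages(orphan_urls, content_db):
    """Suggest existing pages from which each orphan page could be linked"""
    # content_db is a project-specific storage object that provides
    # get_by_url(url) and search(keywords, exclude_url, limit)
    for url in orphan_urls:
        page = content_db.get_by_url(url)
        keywords = extract_keywords(page['title'])

        # Find pages mentioning these keywords
        related = content_db.search(keywords, exclude_url=url, limit=5)

        for related_page in related:
            print(f"Add link to {url} from {related_page['url']}")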
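The function above relies on a `content_db` object with `get_by_url` and `search` methods, which depends on the CMS in use. For experiments, an in-memory stand-in could look like the sketch below. The class, its ranking rule (number of matching keywords), and the sample pages are all hypothetical.

```python
class InMemoryContentDB:
    """Hypothetical stand-in for content_db: pages are dicts with url, title, body."""

    def __init__(self, pages):
        self.pages = {page["url"]: page for page in pages}

    def get_by_url(self, url):
        return self.pages[url]

    def search(self, keywords, exclude_url=None, limit=5):
        """Rank pages by how many of the keywords appear in their title or body."""
        scored = []
        for url, page in self.pages.items():
            if url == exclude_url:
                continue
            text = (page["title"] + " " + page["body"]).lower()
            hits = sum(1 for kw in keywords if kw.lower() in text)
            if hits:
                scored.append((hits, url))
        # Most keyword matches first; ties are ordered by URL for stable output
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [self.pages[url] for _, url in scored[:limit]]


# Illustrative content only
db = InMemoryContentDB([
    {"url": "/guide/seo", "title": "SEO guide",
     "body": "Technical audit, on-page work and link building."},
    {"url": "/blog/site-speed", "title": "Site speed checklist",
     "body": "Core Web Vitals and a technical audit of templates."},
    {"url": "/guide/technical-seo", "title": "Technical SEO audit",
     "body": "Crawl budget, canonical tags, redirects."},
    {"url": "/contacts", "title": "Contacts", "body": "Phone and address."},
])

donors = db.search(["technical", "audit"], exclude_url="/guide/technical-seo")
print([page["url"] for page in donors])
```

A production version would normally delegate `search` to the site's full-text index instead of scanning every page.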

Timeline

An internal linking audit with recommendations for improving the structure takes 2–3 business days.