Tom Osman

Making Your Personal Website AI-Agent Friendly

// January 21, 2026

As AI agents become the primary way people discover and access information, ensuring your personal website is discoverable by these systems is no longer optional—it's essential.

Today, I implemented comprehensive AI agent support for tomosman.com. Here's everything I learned and the exact steps to do the same for your site.

Why It Matters

The Shift to AI-First Discovery

Traditional SEO focused on Google rankings. But a new reality is emerging:

  • ChatGPT has 180+ million weekly active users
  • Perplexity processes millions of daily queries
  • Claude is integrated into countless workflows
  • AI agents are becoming the interface between humans and information

If someone asks an AI about you or your field, your site should be part of the knowledge base.

The Opportunity

Most personal websites are invisible to AI agents. By optimizing for AI discovery, you can:

  • Appear in AI-generated answers and citations
  • Train AI models on your content
  • Become a trusted source in your niche
  • Capture traffic from AI-first users

The Foundation: robots.txt

The robots.txt file tells crawlers which parts of your site they may access (well-behaved bots honor it, though it's a convention rather than an enforcement mechanism). Most sites only allow basic search crawlers.

I configured mine to allow 40+ AI agents:

// app/robots.ts
import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      // OpenAI (ChatGPT)
      { userAgent: "GPTBot", allow: "/" },
      { userAgent: "ChatGPT-User", allow: "/" },
      
      // Anthropic (Claude)
      { userAgent: "ClaudeBot", allow: "/" },
      { userAgent: "Claude-Web", allow: "/" },
      
      // AI Search Engines
      { userAgent: "PerplexityBot", allow: "/" },
      { userAgent: "PhindBot", allow: "/" },
      { userAgent: "ExaBot", allow: "/" },
      
      // ... and 30+ more
      
    ],
    sitemap: "https://yoursite.com/sitemap.xml",
  };
}

Key AI Bots to Allow

Bot                 Source           Purpose
GPTBot              OpenAI           ChatGPT training
ChatGPT-User        OpenAI           ChatGPT user interactions
OAI-SearchBot       OpenAI           SearchGPT indexing
ClaudeBot           Anthropic        Claude training
Claude-Web          Anthropic        Claude web browsing
PerplexityBot       Perplexity       AI search queries
YouBot              You.com          AI search engine
PhindBot            Phind            Developer search AI
ExaBot              Exa              Neural search engine
Google-Extended     Google           Gemini AI
Applebot-Extended   Apple            Apple Intelligence
Bytespider          ByteDance        TikTok AI
Amazonbot           Amazon           Alexa AI services
CCBot               Common Crawl     Web archiving
Facebookbot         Meta             AI training
LinkedInBot         LinkedIn         Professional AI
Grok-bot            X/Twitter        Grok AI
AI2Bot              Allen Institute  Academic AI research
cohere-ai           Cohere           Enterprise AI
Timpibot            Timpi            AI search
FirecrawlAgent      Firecrawl        Web scraping for AI
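
Maintaining 40+ near-identical rule objects by hand gets noisy. One way to keep the robots() export short is to generate the rules from a plain array of user-agent strings. This is a sketch, not the site's actual code, and the bot names below are a trimmed sample of the table above:

```typescript
// Sketch: generate "allow everything" rules from a list of bot names.
// Rule mirrors the shape Next.js expects in robots().rules.
type Rule = { userAgent: string; allow: string };

// Trimmed sample; extend with the full table above.
export const AI_BOTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot"];

export function allowAll(bots: string[]): Rule[] {
  return bots.map((userAgent) => ({ userAgent, allow: "/" }));
}

// Usage inside robots():  rules: allowAll(AI_BOTS)
```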

Structured Data: Person Schema

AI systems rely heavily on structured data to understand content. I added comprehensive Person schema to the site:

const jsonLd = {
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Tom Osman",
  "jobTitle": "Technologist & Educator",
  "description": "Explores the frontier of digital technologies.",
  "url": "https://www.tomosman.com",
  "sameAs": [
    "https://x.com/tomosman",
    "https://github.com/tomcharlesosman",
    "https://youtube.com/@tomosman",
    "https://linkedin.com/in/thomascharlesosman/"
  ],
  "knowsAbout": [
    "Artificial Intelligence",
    "No-Code Development",
    "Automation",
    "Developer Relations"
  ],
  "worksFor": {
    "@type": "Organization",
    "name": "Shiny Technologies"
  }
};

This schema helps AI systems understand:

  • Who you are
  • What you do
  • Your areas of expertise
  • Where to find you online
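
For the schema to be visible, it has to be serialized into a <script type="application/ld+json"> tag on the page. A minimal sketch of the serialization step (in a Next.js layout you would typically pass the JSON string to dangerouslySetInnerHTML; the helper below just builds the markup):

```typescript
// Sketch: serialize a schema.org object into a JSON-LD script tag.
// Note: if the schema ever contains user-supplied strings, escape "<"
// so the payload can't close the tag early.
export function jsonLdScript(schema: object): string {
  return `<script type="application/ld+json">${JSON.stringify(schema)}</script>`;
}

// Example with a cut-down Person schema:
const tag = jsonLdScript({
  "@context": "https://schema.org",
  "@type": "Person",
  name: "Tom Osman",
});
```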

AI Content Policies

I created two new files specifically for AI systems:

1. llms.txt

This file (inspired by the llms.txt specification) explicitly states your content policies:

# llms.txt - AI Content Policy

## Allowed Content
All public content is available for:
- AI training and model improvement
- AI-powered search and answer generation
- Citation and reference in AI-generated responses

## Disallowed Content
- /private/ - Private areas
- /api/ - API endpoints

## About the Site
Tom Osman explores the frontier of digital technologies.
Daily livestreams, educational guides, and curated tools.

2. llms-full-text.txt

A full-text summary of your site for AI training:

# llms-full-text.txt

## About
Tom Osman explores the frontier of digital technologies.

## Content Sections
- About: Technology exploration and education
- Tools Inventory: Curated AI tools
- Livestreams: Daily Technology Dealer
- Blog: Long-form guides
- Portfolio: Selected work

## Keywords
digital technologies, AI, no-code, automation...
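
Next.js has no built-in convention for these files, so they can either live as static files in public/ or be served from a route handler. A sketch of the route-handler approach, assuming a file at app/llms.txt/route.ts (the body here is abbreviated, not the full policy):

```typescript
// Sketch: serve llms.txt from a Next.js route handler
// (app/llms.txt/route.ts). A static file in public/ works too.
const LLMS_TXT = [
  "# llms.txt - AI Content Policy",
  "",
  "## Allowed Content",
  "All public content is available for AI training, search, and citation.",
].join("\n");

export function GET(): Response {
  return new Response(LLMS_TXT, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```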

Comprehensive Metadata

I added extensive meta tags optimized for AI classification:

export const metadata = {
  keywords: [
    "Tom Osman",
    "digital technologies",
    "AI",
    "no-code",
    "automation",
    // ... more keywords
  ],
  other: {
    "ai-content": "educational",
    "ai-topic": "digital technologies, AI, automation",
    "ai-audience": "builders, developers, educators",
    "ai-use": "training,search,answer-generation,citation",
  },
};

Sitemap Optimization

My sitemap now includes all site content for comprehensive indexing:

  • Static pages (9 pages)
  • Blog posts (with publication dates)
  • Portfolio projects (7 projects)
  • Tools inventory (22 tools)

This ensures AI agents can discover and index all your content.
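
In the same App Router setup as the robots example, app/sitemap.ts returns these entries programmatically. A sketch of the idea; the paths and base URL are placeholders, and in practice you would pull blog posts and projects from your content source:

```typescript
// Sketch: build sitemap entries from route paths. SitemapEntry mirrors
// the shape Next.js expects from app/sitemap.ts.
type SitemapEntry = { url: string; lastModified?: Date };

const BASE_URL = "https://yoursite.com"; // placeholder domain

export function buildSitemap(paths: string[]): SitemapEntry[] {
  return paths.map((path) => ({
    url: `${BASE_URL}${path}`,
    lastModified: new Date(),
  }));
}

// Usage: export default () => buildSitemap(["/", "/blog", "/tools"]);
```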

The Complete Bot List

Here's the complete list of bots I allowed (40+ total):

AI Agents

  • GPTBot, ChatGPT-User, OAI-SearchBot, OAI-ImageBot
  • ClaudeBot, Claude-Web, anthropic-ai
  • PerplexityBot, Perplexity-User
  • Google-Extended, Applebot-Extended
  • Bytespider, Amazonbot
  • YouBot, PhindBot, ExaBot, AndiBot
  • FirecrawlAgent, cohere-ai, AI2Bot
  • Grok-bot, academic-ai, Timpibot
  • ImagesiftBot, Kangaroo Bot, omgilibot, Diffbot

Social Platforms

  • Facebookbot, LinkedInBot, TwitterBot
  • SlackBot, TelegramBot, DiscordBot

Search & SEO

  • Bingbot, DuckDuckBot, SemrushBot
  • AhrefsBot, PetalBot, SeznamBot
  • Naverbot, YandexBot

Results

After implementing these changes:

  1. AI Visibility: The site is now accessible to 40+ AI crawlers
  2. Knowledge Panels: Person schema increases Knowledge Panel potential
  3. Citation Ready: AI agents can cite and reference the content
  4. Training Data: The content can be included in AI model training
  5. AI Search: The site can surface in Perplexity, ChatGPT, and other AI search results

Quick Start Checklist

Want to do the same for your site?

  1. Update robots.txt to allow AI bots (copy the list above)
  2. Add Person schema with your name, role, and links
  3. Create llms.txt explaining your content policies
  4. Generate llms-full-text.txt with site summary
  5. Add comprehensive meta tags with keywords
  6. Optimize sitemap to include all pages
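
Once deployed, it's worth sanity-checking the generated /robots.txt. A rough check that a given bot gets an explicit Allow rule; this is a simplified parser for eyeballing your own file, not a spec-complete robots.txt implementation:

```typescript
// Sketch: check whether a bot has an explicit "Allow: /" in robots.txt.
// Simplified: splits on blank lines into per-agent blocks.
export function botAllowed(robotsTxt: string, bot: string): boolean {
  return robotsTxt.split(/\n\s*\n/).some(
    (block) =>
      block.toLowerCase().includes(`user-agent: ${bot.toLowerCase()}`) &&
      /^allow:\s*\/\s*$/im.test(block)
  );
}

// e.g. fetch("https://yoursite.com/robots.txt")
//        .then((r) => r.text())
//        .then((txt) => console.log(botAllowed(txt, "GPTBot")));
```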

The Future

As AI agents become the primary interface for information, being discoverable isn't optional—it's foundational to your online presence.

The work done today ensures that when someone asks an AI about "digital technologies" or "AI tools for builders," tomosman.com is part of the knowledge graph.

Research & References

This guide was created using insights from:

  • LLMS Central — Comprehensive guide to AI bot user-agents
  • Dark Visitors — Detailed AI bot profiles and documentation
  • Paul Calvano — Data-driven analysis of AI bot growth and adoption
  • Adnan Zameer — Practical implementation guide for robots.txt


Implementing AI discovery for your site? Tell me—I'd love to help.