Tom Osman

Making Your Personal Website AI-Agent Friendly

// January 21, 2026

As AI agents become the primary way people discover and access information, ensuring your personal website is discoverable by these systems is no longer optional—it's essential.

Today, I implemented comprehensive AI agent support for tomosman.com. Here's everything I learned and the exact steps to do the same for your site.

Why It Matters

The Shift to AI-First Discovery

Traditional SEO focused on Google rankings. But a new reality is emerging:

  • ChatGPT has 180+ million weekly active users
  • Perplexity processes millions of daily queries
  • Claude is integrated into countless workflows
  • AI agents are becoming the interface between humans and information

If someone asks an AI about you or your field, your site should be part of the knowledge base.

The Opportunity

Most personal websites are invisible to AI agents. By optimizing for AI discovery, you can:

  • Appear in AI-generated answers and citations
  • Train AI models on your content
  • Become a trusted source in your niche
  • Capture traffic from AI-first users

The Foundation: robots.txt

The robots.txt file tells crawlers which parts of your site they may access (well-behaved bots honor it, though it's a convention rather than an enforcement mechanism). Most sites only allow basic search crawlers.

I configured mine to allow 40+ AI agents:

// app/robots.ts
import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      // OpenAI (ChatGPT)
      { userAgent: "GPTBot", allow: "/" },
      { userAgent: "ChatGPT-User", allow: "/" },
      
      // Anthropic (Claude)
      { userAgent: "ClaudeBot", allow: "/" },
      { userAgent: "Claude-Web", allow: "/" },
      
      // AI Search Engines
      { userAgent: "PerplexityBot", allow: "/" },
      { userAgent: "PhindBot", allow: "/" },
      { userAgent: "ExaBot", allow: "/" },
      
      // ... and 30+ more
      
    ],
    sitemap: "https://yoursite.com/sitemap.xml",
  };
}

Key AI Bots to Allow

Bot                 Source           Purpose
GPTBot              OpenAI           ChatGPT training
ChatGPT-User        OpenAI           ChatGPT user interactions
OAI-SearchBot       OpenAI           SearchGPT indexing
ClaudeBot           Anthropic        Claude training
Claude-Web          Anthropic        Claude web browsing
PerplexityBot       Perplexity       AI search queries
YouBot              You.com          AI search engine
PhindBot            Phind            Developer search AI
ExaBot              Exa              Neural search engine
Google-Extended     Google           Gemini AI
Applebot-Extended   Apple            Apple Intelligence
Bytespider          ByteDance        TikTok AI
Amazonbot           Amazon           Alexa AI services
CCBot               Common Crawl     Web archiving
Facebookbot         Meta             AI training
LinkedInBot         LinkedIn         Professional AI
Grok-bot            X/Twitter        Grok AI
AI2Bot              Allen Institute  Academic AI research
cohere-ai           Cohere           Enterprise AI
Timpibot            Timpi            AI search
FirecrawlAgent      Firecrawl        Web scraping for AI
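
Maintaining 40+ near-identical rule objects by hand gets noisy. One way to keep the robots() export short is to generate the rules from a plain array of user-agent strings. This is a sketch, not the site's actual code, and the bot names below are a trimmed sample of the table above:

```typescript
// Sketch: generate "allow everything" rules from a list of bot names.
// Rule mirrors the shape Next.js expects in robots().rules.
type Rule = { userAgent: string; allow: string };

// Trimmed sample; extend with the full table above.
export const AI_BOTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot"];

export function allowAll(bots: string[]): Rule[] {
  return bots.map((userAgent) => ({ userAgent, allow: "/" }));
}

// Usage inside robots():  rules: allowAll(AI_BOTS)
```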

Structured Data: Person Schema

AI systems rely heavily on structured data to understand content. I added comprehensive Person schema to the site:

const jsonLd = {
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Tom Osman",
  "jobTitle": "Technologist & Educator",
  "description": "Explores the frontier of digital technologies.",
  "url": "https://www.tomosman.com",
  "sameAs": [
    "https://x.com/tomosman",
    "https://github.com/tomcharlesosman",
    "https://youtube.com/@tomosman",
    "https://linkedin.com/in/thomascharlesosman/"
  ],
  "knowsAbout": [
    "Artificial Intelligence",
    "No-Code Development",
    "Automation",
    "Developer Relations"
  ],
  "worksFor": {
    "@type": "Organization",
    "name": "Shiny Technologies"
  }
};

This schema helps AI systems understand:

  • Who you are
  • What you do
  • Your areas of expertise
  • Where to find you online
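
For the schema to be visible, it has to be serialized into a <script type="application/ld+json"> tag on the page. A minimal sketch of the serialization step (in a Next.js layout you would typically pass the JSON string to dangerouslySetInnerHTML; the helper below just builds the markup):

```typescript
// Sketch: serialize a schema.org object into a JSON-LD script tag.
// Note: if the schema ever contains user-supplied strings, escape "<"
// so the payload can't close the tag early.
export function jsonLdScript(schema: object): string {
  return `<script type="application/ld+json">${JSON.stringify(schema)}</script>`;
}

// Example with a cut-down Person schema:
const tag = jsonLdScript({
  "@context": "https://schema.org",
  "@type": "Person",
  name: "Tom Osman",
});
```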

AI Content Policies

I created two new files specifically for AI systems:

1. llms.txt

This file (inspired by the llms.txt specification) explicitly states your content policies:

# llms.txt - AI Content Policy

## Allowed Content
All public content is available for:
- AI training and model improvement
- AI-powered search and answer generation
- Citation and reference in AI-generated responses

## Disallowed Content
- /private/ - Private areas
- /api/ - API endpoints

## About the Site
Tom Osman explores the frontier of digital technologies.
Daily livestreams, educational guides, and curated tools.

2. llms-full-text.txt

A full-text summary of your site for AI training:

# llms-full-text.txt

## About
Tom Osman explores the frontier of digital technologies.

## Content Sections
- About: Technology exploration and education
- Tools Inventory: Curated AI tools
- Livestreams: Daily Technology Dealer
- Blog: Long-form guides
- Portfolio: Selected work

## Keywords
digital technologies, AI, no-code, automation...
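
Next.js has no built-in convention for these files, so they can either live as static files in public/ or be served from a route handler. A sketch of the route-handler approach, assuming a file at app/llms.txt/route.ts (the body here is abbreviated, not the full policy):

```typescript
// Sketch: serve llms.txt from a Next.js route handler
// (app/llms.txt/route.ts). A static file in public/ works too.
const LLMS_TXT = [
  "# llms.txt - AI Content Policy",
  "",
  "## Allowed Content",
  "All public content is available for AI training, search, and citation.",
].join("\n");

export function GET(): Response {
  return new Response(LLMS_TXT, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```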

Comprehensive Metadata

I added extensive meta tags optimized for AI classification:

export const metadata = {
  keywords: [
    "Tom Osman",
    "digital technologies",
    "AI",
    "no-code",
    "automation",
    // ... more keywords
  ],
  other: {
    "ai-content": "educational",
    "ai-topic": "digital technologies, AI, automation",
    "ai-audience": "builders, developers, educators",
    "ai-use": "training,search,answer-generation,citation",
  },
};

Sitemap Optimization

My sitemap now includes all site content for comprehensive indexing:

  • Static pages (9 pages)
  • Blog posts (with publication dates)
  • Portfolio projects (7 projects)
  • Tools inventory (22 tools)

This ensures AI agents can discover and index all your content.
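
In the same App Router setup as the robots example, app/sitemap.ts returns these entries programmatically. A sketch of the idea; the paths and base URL are placeholders, and in practice you would pull blog posts and projects from your content source:

```typescript
// Sketch: build sitemap entries from route paths. SitemapEntry mirrors
// the shape Next.js expects from app/sitemap.ts.
type SitemapEntry = { url: string; lastModified?: Date };

const BASE_URL = "https://yoursite.com"; // placeholder domain

export function buildSitemap(paths: string[]): SitemapEntry[] {
  return paths.map((path) => ({
    url: `${BASE_URL}${path}`,
    lastModified: new Date(),
  }));
}

// Usage: export default () => buildSitemap(["/", "/blog", "/tools"]);
```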

The Complete Bot List

Here's the complete list of bots I allowed (40+ total):

AI Agents

  • GPTBot, ChatGPT-User, OAI-SearchBot, OAI-ImageBot
  • ClaudeBot, Claude-Web, anthropic-ai
  • PerplexityBot, Perplexity-User
  • Google-Extended, Applebot-Extended
  • Bytespider, Amazonbot
  • YouBot, PhindBot, ExaBot, AndiBot
  • FirecrawlAgent, cohere-ai, AI2Bot
  • Grok-bot, academic-ai, Timpibot
  • ImagesiftBot, Kangaroo Bot, omgilibot, Diffbot

Social Platforms

  • Facebookbot, LinkedInBot, TwitterBot
  • SlackBot, TelegramBot, DiscordBot

Search & SEO

  • Bingbot, DuckDuckBot, SemrushBot
  • AhrefsBot, PetalBot, SeznamBot
  • Naverbot, YandexBot

Results

After implementing these changes:

  1. AI Visibility: The site is now accessible to 40+ AI crawlers
  2. Knowledge Panels: Person schema increases Knowledge Panel potential
  3. Citation Ready: AI agents can cite and reference the content
  4. Training Data: The content can be included in AI model training
  5. AI Search: The site can surface in Perplexity, ChatGPT, and other AI search results

Quick Start Checklist

Want to do the same for your site?

  1. Update robots.txt to allow AI bots (copy the list above)
  2. Add Person schema with your name, role, and links
  3. Create llms.txt explaining your content policies
  4. Generate llms-full-text.txt with site summary
  5. Add comprehensive meta tags with keywords
  6. Optimize sitemap to include all pages
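
Once deployed, it's worth sanity-checking the generated /robots.txt. A rough check that a given bot gets an explicit Allow rule; this is a simplified parser for eyeballing your own file, not a spec-complete robots.txt implementation:

```typescript
// Sketch: check whether a bot has an explicit "Allow: /" in robots.txt.
// Simplified: splits on blank lines into per-agent blocks.
export function botAllowed(robotsTxt: string, bot: string): boolean {
  return robotsTxt.split(/\n\s*\n/).some(
    (block) =>
      block.toLowerCase().includes(`user-agent: ${bot.toLowerCase()}`) &&
      /^allow:\s*\/\s*$/im.test(block)
  );
}

// e.g. fetch("https://yoursite.com/robots.txt")
//        .then((r) => r.text())
//        .then((txt) => console.log(botAllowed(txt, "GPTBot")));
```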

The Future

As AI agents become the primary interface for information, being discoverable isn't optional—it's foundational to your online presence.

The work done today ensures that when someone asks an AI about "digital technologies" or "AI tools for builders," tomosman.com is part of the knowledge graph.

Research & References

This guide was created using insights from:

  • LLMS Central — Comprehensive guide to AI bot user-agents
  • Dark Visitors — Detailed AI bot profiles and documentation
  • Paul Calvano — Data-driven analysis of AI bot growth and adoption
  • Adnan Zameer — Practical implementation guide for robots.txt


Implementing AI discovery for your site? Tell me—I'd love to help.