Robots.txt for my WoltLab Suite website

  • Why not use the one from WoltLab?


    Code
    User-agent: *
    Disallow: /search/
    Disallow: /tagged/

    This sounds pretty valid to me. :)

    return null;


    Browser: Firefox Nightly (64bit)

    Operating system: Windows 10

  • Should there be instructions in the Robots.txt file to block the calendar and the ACP?


    I've read that Google spends ages crawling calendar pages that use up unnecessary resources.


    Many thanks

    Jupiter

    I am a Newbie Admin. Please be gentle, I don't understand technical things.
    (Please can we have a full manual for this software)

  • Should there be instructions in the Robots.txt file to block the calendar and the ACP?

    I don't know why you'd want to block the calendar, but blocking the ACP gains you very little: Google won't index it anyway, because its pages carry a meta tag that tells crawlers not to index them.

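The meta tag mentioned above can be checked from outside with a few lines of Python. This is only a rough sketch using the standard library's `html.parser`; the sample markup, the class name `RobotsMetaParser` and the helper `is_noindex` are made up for illustration and are not taken from WoltLab's actual ACP output.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def is_noindex(html: str) -> bool:
    """Return True if the page asks crawlers not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page head, an assumption for this example only.
SAMPLE_ACP_PAGE = (
    '<html><head><meta name="robots" content="noindex,nofollow"></head>'
    "<body></body></html>"
)
print(is_noindex(SAMPLE_ACP_PAGE))
```

One thing to keep in mind: a crawler only sees this tag if it is allowed to fetch the page, so a Disallow rule for the same path in robots.txt would hide the tag from it.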

  • Thanks Jens. I'll stick to what WoltLab uses for its robots.txt file as you helpfully illustrated.


  • So will this be the original BB robots.txt?

    Code
    User-agent: *
    Disallow: /search/
    Disallow: /tagged/

    Thanks!

    Testing this Forum Script (Trial) :)

  • I would definitely add the Sitemap line if you are on 3.1:


    Code
    Sitemap: https://URL-TO-WOLTLAB-ROOT/sitemaps/sitemap.xml

    Kind regards from Stuttgart
    TheSonic

  • Folders like Cache that contain an .htaccess file blocking all access don't need to be added to robots.txt, because bots are already locked out of them. WBB has this pretty well covered: it ships .htaccess files in the few folders that bots (or anyone else) shouldn't be browsing. So there's not much need to put anything in robots.txt apart from pointing Google to the sitemap file.


    I'm actually surprised that WBB 3.1 doesn't ship with a robots.txt file that does exactly that and points Google to the sitemap location.

  • You could add a crawl delay, which some bots such as Bing and Yahoo honor (Googlebot ignores the Crawl-delay directive). It helps a bit with not getting hammered by too many of those bots at once. I used this on my site.


    Code
    User-agent: *
    Crawl-delay: 10
  • I see,


    This is what I have right now:


    Code
    User-agent: *
    Crawl-delay: 10
    Disallow: /search/
    Disallow: /tagged/
    Sitemap: https://www.mysite.com/sitemaps/sitemap.xml

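A combined file like the one above can be sanity-checked before uploading it. The following is a minimal sketch using Python's standard `urllib.robotparser`; the bot name `ExampleBot`, the helper `allowed` and the `/calendar/` path are assumptions for illustration, and `site_maps()` needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, pasted in as a string.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /search/
Disallow: /tagged/
Sitemap: https://www.mysite.com/sitemaps/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` (bot name is hypothetical)."""
    return parser.can_fetch(agent, "https://www.mysite.com" + path)


print(allowed("/search/"))               # covered by a Disallow rule
print(allowed("/calendar/"))             # no rule matches this path
print(parser.crawl_delay("ExampleBot"))  # delay from the "*" group
print(parser.site_maps())                # list of Sitemap URLs
```

This only shows how Python's parser reads the file. Real crawlers make their own choices, and Googlebot in particular does not act on Crawl-delay.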