Description
Feature Description
Gitea is a magnet for search engines. Once they find an instance, they happily follow every link on the site, of which there are many, resulting in never-ending indexer bot traffic. Among the links followed are UI buttons (star a page, sort by XYZ, select a UI language, ...) as well as pages that are expensive to render but provide little value once indexed (blame, compare, commit, ...).
Ideally, crawlers would not even attempt to index these.
I tried to accomplish this on my site with a robots.txt along the following lines, but was not exactly successful, probably because many bots don't understand the wildcard syntax:
User-agent: *
Disallow: /
Allow: /whitelisted-user
Disallow: /*/raw
Disallow: /*/commit
Disallow: /*/blame
Disallow: /*/src
Disallow: /*?lang=*
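To illustrate why the wildcard rules may be silently ignored, here is a minimal sketch that compares two ways a crawler might interpret a rule path. Both functions (`prefix_match`, `wildcard_match`) and the example paths are hypothetical and written only for this comparison; they are not taken from any particular bot. The assumption is that a bot without wildcard support treats the rule as a literal path prefix.

```python
import re


def prefix_match(pattern: str, path: str) -> bool:
    """Literal prefix matching, as a parser without wildcard support might do."""
    return path.startswith(pattern)


def wildcard_match(pattern: str, path: str) -> bool:
    """Matching where '*' stands for any sequence and a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal parts and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


# Hypothetical paths, shaped like the URLs the rules above try to block.
raw_path = "/some-user/some-repo/raw/branch/main/README.md"
lang_path = "/some-user/some-repo?lang=de-DE"

for pattern, path in (("/*/raw", raw_path), ("/*?lang=*", lang_path)):
    print(pattern, "wildcard:", wildcard_match(pattern, path),
          "prefix:", prefix_match(pattern, path))
```

Under the prefix interpretation neither rule matches, because no path on the site literally starts with `/*/`, so such a bot would fall back to whatever the remaining rules say.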
A better approach would be to render most links with the rel="nofollow" attribute. I'd argue this could be applied to all links, except for links to
- landing page
- user / org
- repo
- issue(s) / PR(s) / release(s) / wiki, and so on
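The rule proposed above could be sketched roughly as follows. This is not Gitea's code: the function names, the `INDEXABLE_SECTIONS` set, and the assumption that paths look like `/{owner}/{repo}/{section}/...` are all hypothetical, and a real implementation would need to handle top-level routes that are not users or orgs.

```python
from html import escape
from urllib.parse import urlsplit

# Hypothetical: repo sub-sections that should stay followable.
INDEXABLE_SECTIONS = {"issues", "pulls", "releases", "wiki"}


def should_nofollow(href: str) -> bool:
    """Decide whether a site-internal link should carry rel="nofollow"."""
    parts = urlsplit(href)
    if parts.query:
        # Sort orders, ?lang=..., and similar UI state.
        return True
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) <= 2:
        # Landing page, user / org, or repo.
        return False
    # Anything deeper is followable only inside an allowed section.
    return segments[2] not in INDEXABLE_SECTIONS


def render_link(href: str, text: str) -> str:
    """Render an anchor tag, adding rel="nofollow" where the rule applies."""
    rel = ' rel="nofollow"' if should_nofollow(href) else ""
    return f'<a href="{escape(href)}"{rel}>{escape(text)}</a>'


print(render_link("/alice/project/issues/12", "#12"))
print(render_link("/alice/project/blame/branch/main/a.go", "Blame"))
```

Using an allow-list rather than a block-list means that newly added routes default to nofollow, which matches the suggestion that the attribute apply to all links except a few.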
Screenshots
No response