Using robots.txt to block URLs with *

Marc van Leeuwen

Premium Member
Joined
May 29, 2016
Messages
1,132
Points
63
Can you tell me how to block this type of URL using robots.txt?

example.com/topic/this-is-web-page-1/reply
example.com/topic/this-is-web-page-2/reply
example.com/topic/this-is-web-page-3/reply
example.com/topic/this-is-web-page-4/reply

I mean a rule that blocks URLs ending in "reply"

while still allowing Google's spiders to crawl these pages:

example.com/topic/this-is-web-page-1/
example.com/topic/this-is-web-page-2/
example.com/topic/this-is-web-page-3/
example.com/topic/this-is-web-page-4/

What is your solution?
 

Rob Whisonant

Moderator
Joined
May 24, 2016
Messages
2,489
Points
113
One thing to keep in mind when using wildcards in robots.txt files: not all crawlers and bots know how to interpret them.
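
To illustrate, here is a minimal Python sketch of how a crawler that only does plain prefix matching might read a wildcard rule. The function name `literal_prefix_blocked` is made up for this example, and the rule is written against the /topic/ URLs from the question. Real bots may behave differently.

```python
def literal_prefix_blocked(rule_path: str, url_path: str) -> bool:
    """Return True if url_path is blocked by a parser that treats the
    rule as a plain prefix and gives '*' no special meaning."""
    return url_path.startswith(rule_path)


rule = "/topic/*/reply"  # hypothetical rule for the example URLs

# The '*' is compared as a literal character, so the reply URL is NOT blocked.
print(literal_prefix_blocked(rule, "/topic/this-is-web-page-1/reply"))
```

A bot like this would simply ignore the rule for the reply pages, so a wildcard rule should not be relied on to keep every crawler out.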
 

Marc van Leeuwen

Premium Member
Joined
May 29, 2016
Messages
1,132
Points
63
I understand: wildcards only work with Google and a few other search engines that support them.
I tested it and found the answer to my question above. I just put this code into robots.txt and it worked.
Code:
Disallow: /topic/*/reply
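
For anyone who wants to check a rule of this shape before deploying it, here is a rough Python sketch of wildcard matching as Google documents it (and as RFC 9309 describes it): `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is matched as a prefix. The helper names `rule_to_regex` and `is_blocked` are made up for this example, and the rule is written against the /topic/ URLs from the question. It is only an approximation, not how any particular crawler is implemented.

```python
import re


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a robots.txt path rule into a compiled regex."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    if anchored:
        pattern += "$"
    return re.compile(pattern)


def is_blocked(rule_path: str, url_path: str) -> bool:
    """True if url_path matches the rule from the start of the path."""
    return rule_to_regex(rule_path).match(url_path) is not None


rule = "/topic/*/reply"  # hypothetical rule for the example URLs
for path in ("/topic/this-is-web-page-1/reply", "/topic/this-is-web-page-1/"):
    print(path, "blocked" if is_blocked(rule, path) else "allowed")
```

Because rules are prefix matches, this rule would also catch longer paths that merely start with a match, such as /topic/page/reply-later. Ending the rule with `$` (for example `/topic/*/reply$`) limits it to URLs that end exactly in "reply", for crawlers that support `$`.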
 