Robots.txt: Allow a subfolder but not the parent











Can anybody please explain the correct robots.txt directives for the following scenario?

I would like to allow access to:

/directory/subdirectory/..

But I would also like to restrict access to /directory/, notwithstanding the above exception.










      robots.txt






asked Sep 30 '11 at 10:24 by QFDev
edited Nov 20 at 1:36 by Paolo
























3 Answers























21 votes (accepted)










Be aware that there is no real official standard and that any web crawler may happily ignore your robots.txt.

According to a Google Groups post, the following works at least with Googlebot:

User-agent: Googlebot
Disallow: /directory/
Allow: /directory/subdirectory/





answered Sep 30 '11 at 10:38 by user967058

















I wanted to dynamically allow sub-directories but not that first level; changing the Allow statement to Allow: /directory/*/ works.
– Duncanmoo Apr 12 '13 at 9:00

According to the robots.txt Wikipedia entry, the 'Allow' directive should be placed before the 'Disallow' for maximum compatibility (though neither Google nor Bing will mind).
– pelms Jun 20 '13 at 14:17
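
As a rough way to sanity-check how a parser reads these rules, here is a minimal sketch (not part of the original answer) using Python's standard urllib.robotparser module. It is only an approximation of Googlebot: this parser applies rules in file order (first match wins) rather than by longest match, which is one more reason to put the Allow line first, as the comment above suggests.

from urllib.robotparser import RobotFileParser

# Caveat: urllib.robotparser uses first-match-in-file-order semantics, not
# Googlebot's longest-match rule, so treat this only as a rough check.
rules = """\
User-agent: Googlebot
Allow: /directory/subdirectory/
Disallow: /directory/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

print(parser.can_fetch("Googlebot", "/directory/page.html"))               # expected: False
print(parser.can_fetch("Googlebot", "/directory/subdirectory/page.html"))  # expected: True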


















2 votes













If these are truly directories then the accepted answer is probably your best choice. But if you're writing an application and the "directories" are dynamically generated paths (a.k.a. contexts, routes, etc.), then you might want to use meta tags instead of defining the rules in robots.txt. This gives you the advantage of not having to worry about how different crawlers may interpret or prioritize access to the subdirectory path.

You might try something like this in the code:

if is_parent_directory_path
  <meta name="robots" content="noindex, nofollow">
end





answered Aug 12 '13 at 16:25 by Javid Jamae
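
For concreteness, here is a minimal sketch of that idea in Python with Flask; the framework choice, route paths, and the parent/child split are illustrative assumptions, not something from the original answer.

from flask import Flask, render_template_string

app = Flask(__name__)

# Shared template: only emit the robots meta tag when block_robots is set.
PAGE = """<!doctype html>
<html>
  <head>
    {% if block_robots %}<meta name="robots" content="noindex, nofollow">{% endif %}
    <title>{{ title }}</title>
  </head>
  <body><h1>{{ title }}</h1></body>
</html>"""

@app.route("/directory/")
def parent():
    # Parent level: ask crawlers not to index this page or follow its links.
    return render_template_string(PAGE, title="Parent", block_robots=True)

@app.route("/directory/<path:subpath>")
def subdirectory(subpath):
    # Dynamically generated sub-paths stay indexable.
    return render_template_string(PAGE, title=subpath, block_robots=False)

Keep in mind that a robots meta tag controls indexing; unlike a robots.txt Disallow, it does not stop the page from being crawled in the first place.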




























1 vote













I would recommend using Google's robots.txt Tester in Google Webmaster Tools: https://support.google.com/webmasters/answer/6062598?hl=en

You can edit and test URLs right in the tool, plus you get a wealth of other tools as well.






answered Feb 25 '16 at 14:22 by Moojjoo





















Good point! Not sure if that was available back in 2011 when I posted this, but it's a very useful addition to WMT.
– QFDev Feb 25 '16 at 17:43

QFDEV I had to use the robots.txt Tester today because I am working really hard to get our company's site ranking higher in Google's search results. The only thing I see under "HTML Improvements" is duplicate titles and meta tags, because the crawler is reading the same pages twice (query strings). Also, for some reason the robot is crawling directories that do not exist. I found your post, which helped, and then noticed the tester in Google Webmaster Tools and saw that it would validate the changes. Thought it could help other developers by posting to your thread.
– Moojjoo Mar 4 '16 at 21:38










