#mod_headers
Cache Optimization in Apache2
This tutorial explores how to optimize the performance of an Apache2 server by implementing advanced caching techniques. It details the strategic use of the mod_deflate, mod_headers, and mod_expires modules to improve the delivery of static content. Through careful configuration and the combination of these modules, it demonstrates how you can increase efficiency and…
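A minimal sketch of such a configuration, assuming all three modules are enabled on the server; the file types and lifetimes shown here are illustrative, not recommendations from the tutorial:

```apache
# Compress text-based responses before sending them (mod_deflate)
AddOutputFilterByType DEFLATE text/html text/css application/javascript

# Attach expiry dates to static assets (mod_expires)
ExpiresActive On
ExpiresByType image/png "access plus 1 month"
ExpiresByType text/css  "access plus 1 week"

# Fine-tune Cache-Control beyond what mod_expires sets (mod_headers)
<FilesMatch "\.(png|css|js)$">
    Header append Cache-Control "public"
</FilesMatch>
```

These directives can live in the main Apache configuration or, if the administrator allows it, in a per-directory .htaccess file.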

pragma online
Make sure the Last-Modified header matches the actual date the content was last changed; don't re-save files and pages you don't intend to modify.

HTTP caching — Exlab

Before sending a request for a URL, the browser checks whether the required object is already in its own cache, and if so, that copy is used. Otherwise the request travels on to the next server, where the cache is checked in the same way. If no match is found on any of the intermediate servers, the request eventually reaches the origin server and the response comes from there. Note that proxies and gateways serve many users at once, which is why their cache is called public, or shared.

As mentioned above, a client request can include headers carrying information about a cached response. On receiving such headers, the server can check the cached copy's freshness and decide whether to send a fresh response or tell the client that the cached one can still be used. This checking process is called validation, and the headers themselves are called validators. The mod_headers module lets you manage any headers, including setting additional Cache-Control directives. Directives for both modules go in the Apache configuration, or in an .htaccess file if the administrator has enabled that option.

A 304 status code means the document has not changed; on receiving it, the client uses the data from its cache. The request still travels all the way to the server, but the response is reduced to headers alone.
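The validation round-trip described above can be sketched in a few lines of Python. The function name and dates are illustrative, not part of any real server; the logic simply compares the client's If-Modified-Since validator against the resource's Last-Modified date:

```python
from email.utils import parsedate_to_datetime

def validate(if_modified_since, last_modified):
    """Return the status a server would send for a conditional GET.

    304 -> the client's cached copy is still fresh, body is omitted;
    200 -> the resource changed, a full response is sent.
    """
    if if_modified_since is None:
        # No validator sent: the server must respond in full.
        return 200
    cached = parsedate_to_datetime(if_modified_since)
    current = parsedate_to_datetime(last_modified)
    return 304 if current <= cached else 200

LAST_MODIFIED = "Wed, 01 Mar 2017 10:00:00 GMT"

# Client revalidates with the date it saw earlier -> use the cache
print(validate("Wed, 01 Mar 2017 10:00:00 GMT", LAST_MODIFIED))  # 304

# Resource changed after the client cached it -> full response
print(validate("Tue, 28 Feb 2017 10:00:00 GMT", LAST_MODIFIED))  # 200
```

Either way the request reaches the server; the saving with a 304 is that the response carries no body.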
Preventing your site from being indexed, the right way
We said it in 2009, and we'll say it again: it keeps amazing us that there are still people using just a robots.txt file to prevent indexing of their site in Google or Bing. As a result, their site shows up in the search engines anyway. You know why it keeps amazing us? Because robots.txt doesn't actually keep your site out of the search results, even though it does prevent crawling of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide.
There is a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms here first:
Indexed / Indexing: The process of downloading a site or a page's content to the search engine's server, thereby adding it to its "index".
Ranking / Listing / Showing: Showing a site on the search result pages (aka SERPs).
So, while the most common process goes from indexing to listing, a site doesn't have to be indexed to be listed. If a link points to a page or domain, Google follows that link. If the robots.txt on that domain prevents the search engine from crawling that page, it may still show the URL in the results if it can gather from other signals that the page might be worth looking at. In the old days, those signals could have come from DMOZ or the Yahoo! directory; these days I can imagine Google using, for instance, your Google My Business details, or old data from those projects. Plenty of other sites summarize your website, after all.
Now if the explanation above doesn’t make sense, have a look at this 2009 Matt Cutts video explanation:
youtube
If you have reasons to prevent indexing of your website, adding that request to the specific pages you want to block, as Matt describes, is still the right way to go. But you'll need to let Google see that meta robots tag. So, if you want to effectively hide pages from the search engines, you need them to crawl those pages, even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that’s more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow">
The issue with a tag like that is that you have to add it to each and every page.
Or by adding an X-Robots-Tag HTTP header
To make adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. It lets you send an HTTP header called X-Robots-Tag with the same value you would give the meta robots tag. The nice thing about this is that you can do it for an entire site at once. If your site runs on Apache and mod_headers is enabled (it usually is), you can add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
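If you only want the header on part of the site rather than everywhere, mod_headers directives can be scoped. A minimal sketch, assuming for illustration that you want to hide only PDF files (the file pattern is hypothetical, not from the original post):

```apache
# Send noindex only for PDF files, leaving the rest of the site listable
<FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
```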
The effect is that the entire site can still be crawled, but it will never be shown in the search results.
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: ‘The ultimate guide to the meta robots tag’ »
http://ift.tt/2ahuoUC
0 notes
Text
Preventing your site from being indexed, the right way
We’ve said it in 2009, and we’ll say it again: it keeps amazing us that there are still people using just a robots.txt files to prevent indexing of their site in Google or Bing. As a result their site shows up in the search engines anyway. You know why it keeps amazing us? Because robots.txt doesn’t actually do the latter, even though it does prevents indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide.
Become a technical SEO expert with our Technical SEO 1 training! »
$ 199€ 199 - Buy now » Info There is a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms here first:
Indexed / Indexing The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index”.
Ranking / Listing / Showing Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from Indexing to Listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at. In the old days, that could have been DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from these projects. There are more sites that summarize your website, right.
Now if the explanation above doesn’t make sense, have a look at this 2009 Matt Cutts video explanation:
youtube
If you have reasons to prevent indexing of your website, adding that request to the specific page you want to block like Matt is talking about, is still the right way to go. But you’ll need to inform Google about that meta robots tag. So, if you want to effectively hide pages from the search engines you need them to index those pages. Even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that’s more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow>
The issue with a tag like that is that you have to add it to each and every page.
Or by adding a X-Robots-Tag HTTP header
To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag and set the value as you would the meta robots tags value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
And this would have the effect that that entire site can be indexed. But would never be shown in the search results.
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: ‘The ultimate guide to the meta robots tag’ »
http://ift.tt/2ahuoUC
0 notes
Text
Preventing your site from being indexed, the right way
We’ve said it in 2009, and we’ll say it again: it keeps amazing us that there are still people using just a robots.txt files to prevent indexing of their site in Google or Bing. As a result their site shows up in the search engines anyway. You know why it keeps amazing us? Because robots.txt doesn’t actually do the latter, even though it does prevents indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide.
Become a technical SEO expert with our Technical SEO 1 training! »
$ 199€ 199 - Buy now » Info There is a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms here first:
Indexed / Indexing The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index”.
Ranking / Listing / Showing Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from Indexing to Listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at. In the old days, that could have been DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from these projects. There are more sites that summarize your website, right.
Now if the explanation above doesn’t make sense, have a look at this 2009 Matt Cutts video explanation:
youtube
If you have reasons to prevent indexing of your website, adding that request to the specific page you want to block like Matt is talking about, is still the right way to go. But you’ll need to inform Google about that meta robots tag. So, if you want to effectively hide pages from the search engines you need them to index those pages. Even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that’s more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow>
The issue with a tag like that is that you have to add it to each and every page.
Or by adding a X-Robots-Tag HTTP header
To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag and set the value as you would the meta robots tags value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
And this would have the effect that that entire site can be indexed. But would never be shown in the search results.
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: ‘The ultimate guide to the meta robots tag’ »
http://ift.tt/2ahuoUC
0 notes
Text
Preventing your site from being indexed, the right way
We’ve said it in 2009, and we’ll say it again: it keeps amazing us that there are still people using just a robots.txt files to prevent indexing of their site in Google or Bing. As a result their site shows up in the search engines anyway. You know why it keeps amazing us? Because robots.txt doesn’t actually do the latter, even though it does prevents indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide.
Become a technical SEO expert with our Technical SEO 1 training! »
$ 199€ 199 - Buy now » Info There is a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms here first:
Indexed / Indexing The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index”.
Ranking / Listing / Showing Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from Indexing to Listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at. In the old days, that could have been DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from these projects. There are more sites that summarize your website, right.
Now if the explanation above doesn’t make sense, have a look at this 2009 Matt Cutts video explanation:
youtube
If you have reasons to prevent indexing of your website, adding that request to the specific page you want to block like Matt is talking about, is still the right way to go. But you’ll need to inform Google about that meta robots tag. So, if you want to effectively hide pages from the search engines you need them to index those pages. Even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that’s more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow>
The issue with a tag like that is that you have to add it to each and every page.
Or by adding a X-Robots-Tag HTTP header
To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag and set the value as you would the meta robots tags value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
And this would have the effect that that entire site can be indexed. But would never be shown in the search results.
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: ‘The ultimate guide to the meta robots tag’ »
http://ift.tt/2ahuoUC
0 notes
Text
Preventing your site from being indexed, the right way
We’ve said it in 2009, and we’ll say it again: it keeps amazing us that there are still people using just a robots.txt files to prevent indexing of their site in Google or Bing. As a result their site shows up in the search engines anyway. You know why it keeps amazing us? Because robots.txt doesn’t actually do the latter, even though it does prevents indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide.
Become a technical SEO expert with our Technical SEO 1 training! »
$ 199€ 199 - Buy now » Info There is a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms here first:
Indexed / Indexing The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index”.
Ranking / Listing / Showing Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from Indexing to Listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at. In the old days, that could have been DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from these projects. There are more sites that summarize your website, right.
Now if the explanation above doesn’t make sense, have a look at this 2009 Matt Cutts video explanation:
youtube
If you have reasons to prevent indexing of your website, adding that request to the specific page you want to block like Matt is talking about, is still the right way to go. But you’ll need to inform Google about that meta robots tag. So, if you want to effectively hide pages from the search engines you need them to index those pages. Even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that’s more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow>
The issue with a tag like that is that you have to add it to each and every page.
Or by adding a X-Robots-Tag HTTP header
To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag and set the value as you would the meta robots tags value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
And this would have the effect that that entire site can be indexed. But would never be shown in the search results.
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: ‘The ultimate guide to the meta robots tag’ »
http://ift.tt/2ahuoUC
0 notes
Text
Preventing your site from being indexed, the right way
We’ve said it in 2009, and we’ll say it again: it keeps amazing us that there are still people using just a robots.txt files to prevent indexing of their site in Google or Bing. As a result their site shows up in the search engines anyway. You know why it keeps amazing us? Because robots.txt doesn’t actually do the latter, even though it does prevents indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide.
Become a technical SEO expert with our Technical SEO 1 training! »
$ 199€ 199 - Buy now » Info There is a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms here first:
Indexed / Indexing The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index”.
Ranking / Listing / Showing Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from Indexing to Listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at. In the old days, that could have been DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from these projects. There are more sites that summarize your website, right.
Now if the explanation above doesn’t make sense, have a look at this 2009 Matt Cutts video explanation:
youtube
If you have reasons to prevent indexing of your website, adding that request to the specific page you want to block like Matt is talking about, is still the right way to go. But you’ll need to inform Google about that meta robots tag. So, if you want to effectively hide pages from the search engines you need them to index those pages. Even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that’s more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow>
The issue with a tag like that is that you have to add it to each and every page.
Or by adding a X-Robots-Tag HTTP header
To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag and set the value as you would the meta robots tags value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
And this would have the effect that that entire site can be indexed. But would never be shown in the search results.
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: ‘The ultimate guide to the meta robots tag’ »
http://ift.tt/2ahuoUC
0 notes
Text
Preventing your site from being indexed, the right way
We’ve said it in 2009, and we’ll say it again: it keeps amazing us that there are still people using just a robots.txt files to prevent indexing of their site in Google or Bing. As a result their site shows up in the search engines anyway. You know why it keeps amazing us? Because robots.txt doesn’t actually do the latter, even though it does prevents indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide.
Become a technical SEO expert with our Technical SEO 1 training! »
$ 199€ 199 - Buy now » Info There is a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms here first:
Indexed / Indexing The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index”.
Ranking / Listing / Showing Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from Indexing to Listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at. In the old days, that could have been DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from these projects. There are more sites that summarize your website, right.
Now if the explanation above doesn’t make sense, have a look at this 2009 Matt Cutts video explanation:
youtube
If you have reasons to prevent indexing of your website, adding that request to the specific page you want to block like Matt is talking about, is still the right way to go. But you’ll need to inform Google about that meta robots tag. So, if you want to effectively hide pages from the search engines you need them to index those pages. Even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that’s more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow>
The issue with a tag like that is that you have to add it to each and every page.
Or by adding a X-Robots-Tag HTTP header
To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag and set the value as you would the meta robots tags value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
And this would have the effect that that entire site can be indexed. But would never be shown in the search results.
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: ‘The ultimate guide to the meta robots tag’ »
http://ift.tt/2ahuoUC
0 notes
Text
Preventing your site from being indexed, the right way
We’ve said it in 2009, and we’ll say it again: it keeps amazing us that there are still people using just a robots.txt files to prevent indexing of their site in Google or Bing. As a result their site shows up in the search engines anyway. You know why it keeps amazing us? Because robots.txt doesn’t actually do the latter, even though it does prevents indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide.
Become a technical SEO expert with our Technical SEO 1 training! »
$ 199€ 199 - Buy now » Info There is a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms here first:
Indexed / Indexing The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index”.
Ranking / Listing / Showing Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from Indexing to Listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at. In the old days, that could have been DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from these projects. There are more sites that summarize your website, right.
Now if the explanation above doesn’t make sense, have a look at this 2009 Matt Cutts video explanation:
youtube
If you have reasons to prevent indexing of your website, adding that request to the specific page you want to block like Matt is talking about, is still the right way to go. But you’ll need to inform Google about that meta robots tag. So, if you want to effectively hide pages from the search engines you need them to index those pages. Even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that’s more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow>
The issue with a tag like that is that you have to add it to each and every page.
Or by adding a X-Robots-Tag HTTP header
To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag and set the value as you would the meta robots tags value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
And this would have the effect that that entire site can be indexed. But would never be shown in the search results.
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: ‘The ultimate guide to the meta robots tag’ »
http://ift.tt/2ahuoUC
0 notes
Text
Preventing your site from being indexed, the right way
We’ve said it in 2009, and we’ll say it again: it keeps amazing us that there are still people using just a robots.txt files to prevent indexing of their site in Google or Bing. As a result their site shows up in the search engines anyway. You know why it keeps amazing us? Because robots.txt doesn’t actually do the latter, even though it does prevents indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide.
Become a technical SEO expert with our Technical SEO 1 training! »
$ 199€ 199 - Buy now » Info There is a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms here first:
Indexed / Indexing The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index”.
Ranking / Listing / Showing Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from Indexing to Listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at. In the old days, that could have been DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from these projects. There are more sites that summarize your website, right?
Now if the explanation above doesn’t make sense, have a look at this 2009 Matt Cutts video explanation:
youtube
If you have reasons to prevent indexing of your website, adding that request to the specific page you want to block, like Matt is talking about, is still the right way to go. But you’ll need to inform Google about that meta robots tag. So, if you want to effectively hide pages from the search engines, you need them to index those pages. Even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags that’s more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow">
The issue with a tag like that is that you have to add it to each and every page.
Or by adding a X-Robots-Tag HTTP header
To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag and set the value as you would the meta robots tags value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
And this would have the effect that the entire site can be indexed, but will never be shown in the search results.
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
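For reference, a slightly more defensive variant of that .htaccess line wraps it in an IfModule guard so Apache doesn’t error out if mod_headers happens to be disabled; the second block is a sketch of scoping the header to one file type (the FilesMatch pattern here is purely illustrative):

```apacheconf
# Guarded site-wide version: only applied when mod_headers is loaded.
<IfModule mod_headers.c>
  Header set X-Robots-Tag "noindex, nofollow"
</IfModule>

# Or scoped to a file type, e.g. keeping only PDFs out of the results:
<IfModule mod_headers.c>
  <FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex, nofollow"
  </FilesMatch>
</IfModule>
```

Either form goes in the same .htaccess file; the scoped version is handy when only part of a site should stay out of the results.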
Read more: ‘The ultimate guide to the meta robots tag’ »
http://ift.tt/2ahuoUC
0 notes
Text
Setup WHM & cPanel on Google Cloud with Mod_PageSpeed
https://lemacksmedia.com/news/08/20/setup-whm-cpanel-on-google-cloud-with-mod_pagespeed/
Setting up Google Cloud for WHM & cPanel
So you’re looking for the resources to set up WHM & cPanel on Google Cloud? Well, we have that, as well as how to set up Mod_PageSpeed (read about PageSpeed Filters)! This is not an all-encompassing article; some items that require even more advanced skills are omitted. If you’re a business owner looking to save money but have no IT or web experience, we can help you get started, or help your IT team. We do not recommend trying this on a production site unless you are confident in what you’re doing. Skip to: Preparing Server | Installing WHM | Mod_PageSpeed Setup
Google Cloud Documentation: https://cloud.google.com/compute/docs/
Create Project
Go to https://console.cloud.google.com/project and Sign In
Click Create Project
Type [PROJECTNAME]
Click Create
Add any other people you need to work on the project through IAM
Create Instance
Click [PROJECTNAME]
Click Menu (3 bars) >> Compute Engine
Click Create Instance
Enter the following:
Name: [INSTANCENAME]
Zone: (whatever is close to you)
Machine Type: [MACHINETYPE]
Click Change in Boot disk and choose:
CentOS 7
Boot disk type: SSD persistent disk
Size: [DISKSIZE] (we recommend at least 80GB)
In Firewall
tick Allow HTTP traffic
tick Allow HTTPS traffic
Click Create
Make a note of the IP address in the project variables list
Reserve IP Address
Click Menu (3 bars) >> VPC network >> External IP addresses
Click Ephemeral on the row that has VM instance [INSTANCENAME]
Select Static
In Name enter [INSTANCENAME]
Click Reserve
Create Firewall Rules
In Google Cloud click Firewall Rules
Click CREATE FIREWALL RULE
In Name enter cpanel
In Source filter enter Allow from any source (0.0.0.0/0)
In Allowed protocols and ports enter (you’ll want to close unused ports after setup): tcp:20; tcp:21; tcp:22; tcp:25; tcp:26; tcp:37; tcp:43; tcp:53; udp:53; tcp:80; tcp:110; tcp:113; tcp:143; tcp:443; tcp:465; udp:465; tcp:587; tcp:783; udp:783; tcp:873; udp:873; tcp:993; tcp:995; tcp:2073; tcp:2077; tcp:2078; tcp:2079; tcp:2080; tcp:2082; tcp:2083; tcp:2086; tcp:2087; tcp:2089; tcp:2095; tcp:2096; tcp:2525; udp:2525; tcp:3306; udp:50000-60000; tcp:50000-60000
Click Create
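If you prefer the CLI over the console, the same rule can be created with gcloud. Here is a sketch — the rule name and the (abbreviated) port list are illustrative, and the snippet only prints the command so you can review and extend it to the full port list above before actually running it:

```shell
# Build and print a gcloud command equivalent to the console steps above.
# (Abbreviated port list; extend to match the full list in the article.)
PORTS="tcp:20-22,tcp:25-26,tcp:53,udp:53,tcp:80,tcp:110,tcp:143,tcp:443,tcp:465,tcp:587,tcp:993,tcp:995,tcp:2082-2083,tcp:2086-2087,tcp:2095-2096,tcp:2525,tcp:3306"
echo gcloud compute firewall-rules create cpanel \
  --allow "$PORTS" \
  --source-ranges "0.0.0.0/0"
```

Remove the leading echo once you are happy with the command; running it requires the gcloud SDK to be installed and authenticated against your project.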
Preparing Server on SSH
Update Root Password
Click SSH button on the instance row
sudo su -
passwd
Using a random password generator, generate a new password, copy it, and make a note of it
Paste the password and press enter (it will look like there has been nothing pasted in, still press enter)
Paste the password again and press enter
Add the password to your password management
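A quick way to generate that random password without leaving the terminal (assumes openssl, which ships with stock CentOS 7):

```shell
# 24 random bytes, base64-encoded -> a 32-character password.
NEWPASS="$(openssl rand -base64 24 | tr -d '\n')"
echo "$NEWPASS"
```

Paste the printed value at both passwd prompts, then store it in your password manager.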
Installing screen
yum install screen
y
Change Hostname
Change Hostname: hostname [SERVERNAME]
Create A Record
Add an A record to your DNS management system for [SERVERNAME] with the IP address which can be found on Google Cloud under your project instance
Installing Cloud Linux (Optional)
Buy Cloudlinux License
Go to https://cln.cloudlinux.com/
Downloading Cloudlinux
On SSH enter the following:
wget https://repo.cloudlinux.com/cloudlinux/sources/cln/cldeploy
sh cldeploy -k [CLOUDLINUX KEY]
On completion, on SSH enter the following:
reboot
Installing WHM
WHM Documentation: https://documentation.cpanel.net/
Buy WHM License from cPanel
Go to https://www.buycpanel.com (choose VPS license not dedicated, you may also just use the free trial for now)
Downloading WHM
On SSH enter the following (Step 2 will take a fair amount of time, be patient):
screen
cd /home && curl -o latest -L https://securedownloads.cpanel.net/latest && sh latest
Continue to Step 4 unless Step 2 failed, then:
systemctl stop NetworkManager.service
systemctl disable NetworkManager.service
systemctl enable network.service
systemctl start network.service
Return to Step 2
/usr/local/cpanel/cpkeyclt
/usr/local/cpanel/bin/checkallsslcerts
Initial WHM Setup
Navigate to http://[SERVERNAME]:2087 (you might have to initially navigate to https://[SERVERIP]:2087; just note your IP in Google Cloud Engine)
Login using root as the username
Step 1: Click I Agree/Go To Step 2
Step 2:
Enter your chosen email as the Server Contact Email Address (e.g. sys@[SERVERTLD])
Enter 8.8.8.8 as the Primary Resolver
Enter 8.8.4.4 as the Secondary Resolver
Enter 1.1.1.1 as the Tertiary Resolver (optional)
Click Save & Go to Step 3
Click Skip This Step and Use Default Settings
Click Save & Go to Step 5
Click Skip This Step and Use Default Settings
Click Finish Setup Wizard
Click Go to WHM
Click Save Settings
Configuring WHM & cPanel Settings
Configure Apache
Go to https://[SERVERNAME]:2087 and login as root
Go to Software >> EasyApache 4
Click Customize
Make sure the following are ticked.
Click Apache Modules
mod_bwlimited
mod_cache
mod_cache_disk
mod_cache_socache
mod_cgi
mod_cpanel
mod_data
mod_dav (optional)
mod_dav_fs (optional)
mod_deflate
mod_env
mod_expires
mod_file_cache
mod_headers
mod_imagemap
mod_mime_magic
mod_mpm_prefork
mod_proxy
mod_proxy_fcgi
mod_proxy_ftp
mod_proxy_html
mod_proxy_http
mod_proxy_scgi
mod_proxy_wstunnel
mod_security2
mod_socache_memcache
mod_ssl
mod_suexec
mod_suphp
mod_unique_id
mod_version
Click PHP Versions
php56 (Optional, don’t use if you don’t have any 5.x sites currently)
php70
php71
php72
Click PHP Extensions (make sure to tick all versions of each extension)
libc-client
pear
php-bcmath
calendar
cli
common
curl
devel
fileinfo
fpm
ftp
gd
iconv
imap
ioncube10
litespeed (optional for LSWS)
mbstring
mcrypt
mysqlnd
pdo
posix
soap
sockets
xml
zendguard
zip
intl
runtime
Click Review
Save as Profile (for use in deploying another server later, or restoring the current one)
Click Provision
Click Done
Apache Config File Optimizations
Go to Service Configuration >> Apache Configuration
Click Include Editor
Under “Post VirtualHost Include” select “All versions” from the dropdown
In the text area paste the following code:
## EXPIRES CACHING ##
<IfModule mod_expires.c>
# Enable expirations
ExpiresActive On
# Default directive
ExpiresDefault "access plus 1 month"
# My favicon
ExpiresByType image/x-icon "access plus 1 year"
# Images
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/jpg "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
# CSS
ExpiresByType text/css "access plus 1 month"
# Javascript
ExpiresByType application/javascript "access plus 1 year"
# PDF
ExpiresByType application/pdf "access plus 1 month"
# Flash
ExpiresByType application/x-shockwave-flash "access plus 1 month"
</IfModule>
## EXPIRES CACHING ##

## ENABLE GZIP COMPRESSION ##
<IfModule mod_deflate.c>
# Compress HTML, CSS, JavaScript, Text, XML and fonts
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/vnd.ms-fontobject
AddOutputFilterByType DEFLATE application/x-font
AddOutputFilterByType DEFLATE application/x-font-opentype
AddOutputFilterByType DEFLATE application/x-font-otf
AddOutputFilterByType DEFLATE application/x-font-truetype
AddOutputFilterByType DEFLATE application/x-font-ttf
AddOutputFilterByType DEFLATE application/x-javascript
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE font/opentype
AddOutputFilterByType DEFLATE font/otf
AddOutputFilterByType DEFLATE font/ttf
AddOutputFilterByType DEFLATE image/svg+xml
AddOutputFilterByType DEFLATE image/x-icon
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/javascript
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/xml
# Remove browser bugs (only needed for really old browsers)
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
Header append Vary User-Agent
</IfModule>
## ENABLE GZIP COMPRESSION ##
Click Update
Click Restart Apache
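As a rough local illustration of what the mod_deflate block above buys you (the file name is throwaway; real savings depend on content, but text assets like HTML, CSS and JS commonly shrink by more than half):

```shell
# Create a repetitive ~9 KB text file, gzip a copy, and compare byte counts.
printf 'ExpiresByType text/css "access plus 1 month"\n%.0s' $(seq 1 200) > sample.txt
gzip -c sample.txt > sample.txt.gz   # leaves sample.txt intact
wc -c sample.txt sample.txt.gz
```

The compressed copy should come out dramatically smaller, which is roughly the transfer saving a browser that sends Accept-Encoding: gzip gets.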
Configuring PHP
Go to Software >> MultiPHP INI Editor
Under Select PHP Version go through each version and configure the following:
allow_url_fopen = Enabled
Ignore: max_execution_time = 360
Ignore: max_input_time = 180
Ignore: memory_limit = 512M
Ignore: upload_max_filesize = 256M
Ignore: In “Editor Mode” post_max_size = 256M
In “Editor Mode” always_populate_raw_post_data = -1
Click Save
Disable Compiler
Go to Security Center >> Compiler Access
Click Disable Compilers
Configure open_basedir Fix
Go to Security Center >> PHP open_basedir Tweak
Tick Enable php open_basedir Protection.
Click Save
Configure Shell Fork Bomb Protection
Go to Security Center >> Shell Fork Bomb Protection
Click Enable Protection
Disable Traceroute
Go to Security Center >> Traceroute Enable/Disable
Click Disable
Allow SMTP on Port 2525
Go to Service Configuration >> Service Manager
Tick both boxes next to Exim Mail Server (on another port)
Change Allow exim to listen on a port other than 25. to 2525
Click Save
Install ClamAV and Munin
Go to cPanel >> Manage Plugins
Click Install ClamAV for cPanel
Click Install Munin for cPanel
Default Show All on List Accounts
Go to Server Configuration >> Tweak Settings
Click Display
Number of accounts per page to display in “List Accounts”. = All
Click Save
Prevent “nobody” from sending mail & Disable Horde and Squirrel
Go to Server Configuration >> Tweak Settings
Click Mail
Prevent “nobody” from sending mail = On
Enable Horde Webmail = Off
Enable Mailman mailing lists = Off
Enable Roundcube webmail = Off
Enable SquirrelMail webmail = Off (removed from WHM/cPanel in version 78)
Click Save
Restrict Spam on Server
Go to Service Configuration >> Exim Configuration Manager
Under the RBLs section:
Click On for RBL: bl.spamcop.net
Click On for RBL: zen.spamhaus.org
Under the Apache SpamAssassin™ Options section
Click On for Apache SpamAssassin™: Forced Global ON
Click On for Scan outgoing messages for spam and reject based on the Apache SpamAssassin™ internal spam_score setting
Click On for Do not forward mail to external recipients if it matches the Apache SpamAssassin™ internal spam_score setting
Click Save
Change hostname
Go to Networking Setup >> Change Hostname
In New Hostname enter: [FQDN] (your fully qualified domain name)
Click Change
Edit cPanel default Quota Plan
Go to Packages >> Edit a Package
Click default
Click Edit
Change the following:
Disk Quota (MB): 5000
Monthly Bandwidth (MB): 100000
Max FTP Accounts: 5
Max Email Accounts: 0
Max Email Lists: 0
Max Databases: 1
Max Subdomains: 5
Max Parked Domains: 5
Max Addon Domains: 5
Maximum Hourly Email by Domain Relayed: 250
Maximum percentage of failed or deferred messages a domain may send per hour: 250
Click Save Changes
Graceful Server Reboot
Go to System Reboot >> Graceful Server Reboot
Click Proceed
Mod_PageSpeed Setup
On SSH enter the following:
yum install rpm-build cpio ea-apache24-mod_version
wget https://github.com/pagespeed/cpanel/raw/master/EA4/ea-apache24-mod_pagespeed-latest-stable.src.rpm
rpmbuild --rebuild ea-apache24-mod_pagespeed-latest-stable.src.rpm
rpm -ivh /root/rpmbuild/RPMS/x86_64/ea-apache24-mod_pagespeed*.rpm
/etc/init.d/apache2 restart or service httpd restart
Note: if you get the following error in Step 3: “RPM build errors: File must begin with “/”: %_httpd_moddir/*.so File must begin with “/”: %_httpd_modconfdir/*.conf”, just create a file named “macros.apache2” in the /etc/rpm/ directory, paste the content below into it, and then restart from Step 3.
%_httpd_mmn 20120211x8664
%_httpd_apxs /usr/bin/apxs
%_httpd_dir /etc/apache2
%_httpd_bindir %_httpd_dir/bin
%_httpd_modconfdir %_httpd_dir/conf.modules.d
%_httpd_confdir %_httpd_dir/conf.d
%_httpd_contentdir /usr/share/apache2
%_httpd_moddir /usr/lib64/apache2/modules
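A sketch of applying that fix from the shell (TARGET is overridable purely so the snippet can be dry-run outside /etc; on the server the path, per the note, is /etc/rpm/macros.apache2 and writing it requires root):

```shell
# Write the RPM macros file described in the note, then re-run rpmbuild.
TARGET="${TARGET:-./macros.apache2}"   # use /etc/rpm/macros.apache2 on the server
cat > "$TARGET" <<'EOF'
%_httpd_mmn 20120211x8664
%_httpd_apxs /usr/bin/apxs
%_httpd_dir /etc/apache2
%_httpd_bindir %_httpd_dir/bin
%_httpd_modconfdir %_httpd_dir/conf.modules.d
%_httpd_confdir %_httpd_dir/conf.d
%_httpd_contentdir /usr/share/apache2
%_httpd_moddir /usr/lib64/apache2/modules
EOF
```

After writing the file as root, restart from Step 3 (the rpmbuild command).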
The installation script will copy the “pagespeed.conf” file into “/usr/local/apache/conf/” or “/etc/apache2/conf.modules.d” on your server. Documentation for Mod_PageSpeed can be found here.
Our article on .htaccess Mod_PageSpeed filters can be found here.
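For orientation, a minimal pagespeed.conf tweak might look like the fragment below. The directive names are real mod_pagespeed directives, but which filters to enable or disable is entirely site-specific, so treat the choices here as placeholders:

```apacheconf
ModPagespeed on
# Low-risk filters to start with:
ModPagespeedEnableFilters collapse_whitespace,remove_comments
# Example of opting out of a heavier filter:
ModPagespeedDisableFilters rewrite_images
```

Restart Apache after editing the file so the changes take effect.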
Check back with us for more updates, and the videos coming Wednesday! https://lemacksmedia.com
Visit Lemacks Media (https://lemacksmedia.com) for updates and more content.
0 notes