#Asynchronous JavaScript & XML
Using AJAX with Rails
When using Rails you can sometimes face tricky situations. For example, the normal flow of visiting a website is that you load a page, and if you want to see new information you have to either reload the page to update it or click a link to visit a different page. But what if you don’t want this p...
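The post is cut off here, but the usual answer in Rails is to let JavaScript fetch just the new data from a JSON endpoint and update the page in place, rather than reloading it. Here is a minimal browser-side sketch; the `/posts.json` route and the `#posts` list element are assumptions for illustration, not part of the original post:

```js
// Fetch new records from a (hypothetical) Rails JSON endpoint and render them
// into the current page without a full reload.
async function refreshPosts() {
  const response = await fetch('/posts.json', {
    headers: { Accept: 'application/json' }
  });
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);

  const posts = await response.json(); // e.g., [{ id: 1, title: '...' }, ...]
  const list = document.querySelector('#posts'); // assumed container element
  list.innerHTML = posts
    .map((post) => `<li>${post.title}</li>`)
    .join('');
}

// Poll every 30 seconds so readers see new posts without reloading the page.
setInterval(refreshPosts, 30000);
```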
What are StudioPress Sites?
StudioPress Sites are built on the WordPress software platform, but they are different in the fact that they don’t reside within WordPress.com and they don’t have to be hosted with WordPress.org software at a third party hosting company. Say what?
In other words, StudioPress Sites have the power of WordPress but they don’t require the maintenance and upkeep associated with self hosted websites or blogs.
This distinction is important because it means StudioPress is providing a hosted version of the Genesis Framework. So users will have all of the power of Genesis, but won’t require any of the upkeep associated with running their own website.
This new website builder is built on my beloved Genesis Framework and it comes with a lot of built-in options with the flexibility of adding your own themes and plugins.
Top Genesis Themes Ready for Install
As part of the service, users can load one of twenty popular Genesis themes for free. This means you do not need to purchase or maintain additional software licenses. With the built-in theme library you can install any of the following themes for free and with just one click:
Altitude Pro
Atmosphere Pro
Author Pro
Brunch Pro
Daily Dish Pro
Digital Pro
Executive Pro
Foodie Pro
Gallery Pro
Infinity Pro
Lifestyle Pro
Magazine Pro
Maker Pro
Metro Pro
News Pro
Parallax Pro
Showcase Pro
Smart Passive Income Pro
Wellness Pro
Workstation Pro
What makes StudioPress Sites different from the Rainmaker Platform is that you are not restricted in the themes you use or the customizations you make. You can upload any Genesis theme and customize it as needed.
One-Click Install of Top WordPress Plugins
I love Genesis because it already offers great options like top notch speed, strong security, HTML5 coding, and oodles of great built-in features. Now combine all this power with the ability to have all of this hosted for you with a one-click install of popular plugins like:
AffiliateWP
AMP
Beaver Builder Lite
Design Palette Pro
Easy Digital Downloads
OptinMonster
Ninja Forms
Restrict Content Pro
Soliloquy Lite
WooCommerce
WPForms Lite
And you always have the option of upgrading to the premium version of Beaver Builder, Design Palette Pro, OptinMonster, Restrict Content Pro, or Soliloquy.
Built-in SEO (Search Engine Optimization) Functionality
If you know me well, you know my first concern with this entire roll out would be SEO. But I don’t have to fret because the Rainmaker Digital team is promising a lot of SEO functionality without a lot of work. StudioPress Sites offers some impressive SEO features such as advanced schema control, XML sitemap generation, robots.txt generation, asynchronous JavaScript loading, enhanced Open Graph output, and breadcrumb title control.
While I’m not a fan of themes controlling SEO, I know the reality of WordPress is that a lot of users fail to ever install SEO plugins. This is a big issue and it equates to many WordPress users missing out on critical SEO features and functionality. For those who fall into that bucket, StudioPress Sites SEO works and works well.
What impresses me more is that SEO consultants (like myself) are not restricted to StudioPress’ SEO features. I can still add popular SEO plugins like Yoast SEO or All in One SEO. This would have been a deal breaker for me, because I heavily use and rely on the Yoast SEO plugin.
With the current set up of StudioPress Sites the Rainmaker Digital team caters to both novice users who don’t know a lot about SEO and advanced users who require very robust SEO plugins like Yoast.
Support for Google’s AMP
What absolutely floored me is the support for AMP. AMP is Google’s new mobile friendly endeavor for speeding up the web for mobile devices. Google is pushing more and more towards mobile usability so AMP is going to be a must have feature in the coming months. It appears the StudioPress development team knows this and they are already preparing for it.
StudioPress Sites offers the WordPress AMP plugin with a one-click install, and you can augment that functionality by adding Yoast’s plugin for AMP. Between these two options, you’ll be Google friendly and ready to take on 2017’s changing SEO landscape.
Who are StudioPress Sites Geared Towards?
According to the Rainmaker Digital team, they are perfect for bloggers, podcasters, and affiliate marketers. These websites would also work well for people who are selling physical products, digital downloads, and membership programs.
Looking at that another way, StudioPress Sites would work great for someone who just wants to sell something online and doesn’t want the hassle associated with software updates, hosting requirements, or even the time involved in reviewing and looking for solutions.
That means these websites are designed for entrepreneurs who are working independently and don’t have internal tech support or even large budgets. The starting price for StudioPress Sites is $27 per month. That is a significant savings when you compare this all-in-one price to the combined cost of hosting, themes, plugins, and updates.
I would quickly add that these websites are designed for micro businesses or solopreneurs and their usage is not intended for mid-market or enterprise companies. While StudioPress Sites would technically work for larger organizations, I don’t see those types of website owners running to StudioPress Sites and adopting this new offering.
How to Add Infinite Scroll to your WordPress Site (Step by Step)
Do you want to add infinite scroll to your WordPress blog?
Infinite scroll is a web design technique that automatically loads the next page of content when visitors reach the bottom of a page. It lets visitors see more content on your blog without clicking pagination links or a ‘Load More’ button.
In this article, we’ll show you how to easily add infinite scroll to your WordPress blog (step by step).
What’s Plenty of Scroll?
Plenty of scroll is an internet create pattern which makes say of AJAX web page load apart from numeric web page navigation to routinely load your subsequent web page thunder materials and show veil it on the spoil of current web page. Plenty of scrolling makes it extra easy to browse further thunder materials by merely scrolling down. It a whole lot thunder materials constantly and infinitely as prospects help scrolling.
Historically, prospects would should click on on on ‘subsequent,’ ‘outdated’ buttons or web page numbers to peek older weblog articles.
When the say of AJAX speedy for Asynchronous Javascript and XML, webpages can talk with the server with out reloading the total web page. It permits internet apps to course of explicit particular person requests and produce data with out refresh.
Probably the most super exAMPles of tons of scroll create are the standard social media web pages lots like Fb, Twitter, Instagram, Pinterest, and additional. Should that you’d be succesful to furthermore very properly be the say of any of them, then you definately definately understand how thunder materials a whole lot with out spoil in your social media timeline.
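To make the technique concrete, here is a rough sketch of what an infinite scroll plugin does under the hood with plain JavaScript. The `/page/N/` pagination URLs and the `.posts` / `article` selectors are assumptions for illustration; a real plugin adapts these to your theme’s markup.

```js
// Minimal infinite scroll: when the visitor nears the bottom of the page,
// fetch the next page over AJAX and append its posts to the current list.
let nextPage = 2;
let loading = false;

window.addEventListener('scroll', async () => {
  const nearBottom =
    window.innerHeight + window.scrollY >= document.body.offsetHeight - 300;
  if (!nearBottom || loading) return;

  loading = true;
  const response = await fetch(`/page/${nextPage}/`); // assumed pagination URL
  if (response.ok) {
    const html = await response.text();
    // Parse the fetched page and pull out just the post markup.
    const doc = new DOMParser().parseFromString(html, 'text/html');
    const newPosts = doc.querySelectorAll('.posts article'); // assumed selectors
    const container = document.querySelector('.posts');
    newPosts.forEach((post) => container.appendChild(post));
    nextPage += 1;
  }
  loading = false;
});
```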
Is Infinite Scrolling Right for Every Website?
Many content websites use the infinite scroll technique to improve the browsing experience and increase engagement. This leads many beginners to ask questions like: is it good for every website, and is it right for my site?
Infinite scrolling works well for websites that present content in a timeline or feed, like the social media apps. It offers a natural browsing experience to users looking for time-based content.
Infinite scroll design is also well suited to mobile and touch devices. For mobile users, scrolling is more user-friendly than tapping tiny page links.
The biggest benefit of the infinite scroll technique is a smooth browsing experience. Users don’t need to click pagination links manually; content loads quickly as they scroll and keeps them engaged.
On the other hand, it can also make your site navigation more difficult. With continuous loading, it is hard to figure out where (on which page) a particular blog article lives. Some users even find it overwhelming to see so many articles at once.
Another drawback of infinite scroll design is that you effectively lose your footer. Even if you have one, it may stay hidden beneath the endlessly loading articles. Many websites put important links in the footer, so not having access to it can disappoint some users.
Perhaps the most concerning downside of infinite scrolling is that it can slow down your website or even crash the server. When we added infinite scroll to one of our smaller blogs, we experienced a server crash after users scrolled excessively, causing memory exhaustion on a small WordPress hosting account. We recommend using managed WordPress hosting if you want to try infinite scrolling.
Now that you know the pros and cons of infinite scroll design, you can decide whether or not to add it to your blog.
If you have decided to add infinite scrolling to your WordPress blog, you can do it easily. We will show you a few plugins you can use, so you can pick the one that best fits your needs.
Adding Infinite Scroll to Your WordPress Blog with Catch Infinite Scroll
The first thing you need to do is install and activate the Catch Infinite Scroll plugin. For more details, see our step-by-step guide on how to install a WordPress plugin.
Upon activation, WordPress will add a new menu item labeled ‘Catch Infinite Scroll’ to your WordPress admin panel. Click on it to configure the plugin settings.
First, choose a trigger option for loading articles. The plugin lets you load content automatically as users scroll down the page, or add a ‘Load More’ button instead.
Select the ‘Scroll’ option to trigger autoloading with scrolling. Alternatively, select the ‘Click’ option if you want to add a Load More button.
Next, you can review the navigation selector, next selector, content selector, and item selector. You don’t need to change anything here because the default options work quite well.
After that, there is an Image option where you can set the content loader icon. By default, it uses a loading GIF image; if you have a better image, you can change it.
There is also an option for the finish text, which is the message shown once a visitor has viewed all of your articles. By default, the text says ‘No more items to display.’ You can easily edit this text as needed.
Once done, click on the Save Changes button.
That’s it! Infinite scrolling is now live on your blog. Visit your blog to see it in action.
Other WordPress Plugins to Add Infinite Scroll in WordPress
Catch Infinite Scroll works well with most WordPress themes; however, it may not work with some. In that case, you can use one of the following infinite scroll WordPress plugins.
1. Ajax Load More
Similar to Catch Infinite Scroll, the Ajax Load More plugin also lets you add infinite scroll and clickable Load More buttons to your WordPress site.
The plugin offers more customization options, including many page-loading icon styles, button styles, and so on. For a detailed walkthrough, see our tutorial on creating a Load More posts button in WordPress using the Ajax Load More plugin.
However, the plugin has something of a learning curve for beginners. It has an advanced interface with many options, including repeater templates, a shortcode builder, WordPress queries, and more.
It may require some coding skills to implement infinite scrolling with this plugin.
2. YITH Infinite Scrolling
YITH Infinite Scrolling is a simple alternative to the Ajax Load More or Catch Infinite Scroll plugins.
Like Catch Infinite Scroll, it has minimal options for setting up scroll-based loading on your site. You just need to install and activate the plugin and enable infinite scrolling.
Anyone, including beginners, can easily set up infinite scrolling using this plugin. However, it doesn’t offer a ‘Load More’ button option, which is included in the other two plugins mentioned above.
We hope this article helped you learn how to add infinite scroll to your WordPress blog. You may also want to see our guide on how to add scroll depth tracking in WordPress.
If you liked this article, then please subscribe to our YouTube Channel for WordPress video tutorials. You can also find us on Twitter and Facebook.
The post How to Add Infinite Scroll to your WordPress Site (Step by Step) appeared first on WPBeginner.
from WordPress https://ift.tt/2MStReL via IFTTT
How to get developers to implement SEO recommendations
The hardest problem in doing SEO isn’t the algorithm updates. It isn’t having access to the enterprise tools. It’s not even whether or not you have the experience to determine where to focus your efforts.
No, the hardest problem in SEO is getting developers to actually execute recommendations.
We all walk into projects hoping to discover an internal champion that can take the developers to lunch and buy them beers in hopes that our suggestions get turned into actions, but sometimes that champion doesn’t show up. In some cases, getting things done may require social engineering. In other cases, it just requires a degree in engineering.
Let’s talk about how you can be better prepared to get developers to act on your recommendations and drive some results.
The Anderson-Alderson scale of developers
First, let’s meet the players.
I like to think there are two opposite extremes in web developers, and I use two of my favorite characters to personify them. One is Thomas Anderson, whom you may remember from “The Matrix” before he became Neo.
Here’s how his boss describes him in the film: “You have a problem, Mr. Anderson. You think that you’re special. You believe that somehow the rules do not apply to you.”
Anderson developers are the type of employees who live on their own terms and only do things when they feel like it. They’re the mavericks that will argue with you on the merits of code style guides, why they left meta tags out of their custom-built CMS entirely, and why they will never implement AMP — meanwhile, not a single line of their code validates against the specifications they hold dear.
They’re also the developers who roll their eyes to your recommendations or talk about how they know all of the “SEO optimizations” you’re presenting, they just haven’t had the time to do them. Sure thing, Mr. Anderssssson.
On the other end of the spectrum you have Elliot Alderson.
For those of you who don’t watch “Mr. Robot,” Alderson is the type of person who will come into the office at 2:00 a.m. to fix things when they break, even going as far as to hop on the company jet that same night to dig into the network’s latest meltdown.
Alderson-type developers are itching to implement your recommendations right away. That’s not because they necessarily care about ranking, but because they care about being good at what they do.
This developer type is attentive and will call you out on your b.s. if you don’t know what you’re talking about. So don’t come in with recommendations about asynchronous JavaScript without understanding how it works.
Aldersons will also help you brainstorm the actual execution of a strong idea and how to help you get your recommendations prioritized in the black hole that is the dev queue. They’re likely to be aware of Google’s documentation, but recognize that it may not always be up to date and respect your experience, so they’ll ask for your thoughts before implementing something they’re unsure of.
My greatest experience with a developer on the Alderson end of the scale was on a client project for a television show. We’d flown out to LA to meet with the team and walk them through our SEO Site Audit.
While we were explaining some of those recommendations, the developer was sitting there in the room, not taking notes. Rather, this gentleman was committing code as we were explaining what needed to be fixed. By the time the meeting was over, all of our high-value recommendations had been implemented.
Unfortunately, I don’t remember that guy’s name — but he is a legend.
Strategic vs. execution deliverables
Throughout the course of my career, deliverables from many agencies have come across my desk, and I’m always struck by the way some companies present their recommendations.
Many deliverables are either just screen shots of Google tools or prescriptions with little to no context. I always like to imagine a CEO having someone print out our work to read while they are riding in a limo to the airport. I want that person to feel that they understand what we’re suggesting, why it’s important, and how we’re doing a great job.
Additionally, I often find that there is no supplemental document to the strategy document that helps the client and its development team actually execute on these recommendations. It’s very much presented as, “Here’s a problem, you should fix it. Good luck.”
For example, when we deliver an SEO Site Audit, each set of problems is presented with context as to why it matters, an illustration of the issue and a series of recommendations, both with screenshots and code snippets. Each set is then prioritized with a score of benefit, ease and readiness of implementation.
All of the issues are coded with a number so that they can be represented in a spreadsheet. In that spreadsheet, there is a tab for each coded issue that highlights the specific URLs where that issue is happening, as well as any corresponding data that represents that issue.
As an example, for a list of meta descriptions that are too long, we will include those URLs, their meta descriptions, and their lengths.
The bigger issue lies in deliverables that are presented more for the client’s review and approval than for implementation by developers. We have a deliverable called “Content Recommendations,” wherein we take a client’s content and place it into a model in Word and track changes to update the body copy, metadata and internal linking structure.
This is great for a marketer to review what we’re doing to their copy and make sure that we continue to maintain the voice and tone. It’s also great if the client has a marketing coordinator who will be doing the manual implementation.
It’s horrible from a development standpoint, in that it requires them to do a very tedious job of going page by page to copy and paste new items, and no developer wants to do that.
That means implementation of the recommendations in that Word document requires a developer who’s high on the Alderson side of the Anderson-Alderson scale.
On the other hand, if we review the client-facing version of the Content Recommendations document with the client and then place the resulting changes into a spreadsheet, a developer could write a script that goes through every page and makes the changes we’re suggesting. More on that later.
This would place the implementation closer to the Anderson end of the Anderson-Alderson scale.
Getting developers to do things is all about scale
Generally speaking, scale is always something to be mindful of with SEO recommendations. Sometimes, though, there is no way to scale what you’re trying to accomplish.
For instance, if you’ve migrated a site and changed its taxonomy in such a way that there is no definitive pattern, you cannot write rule-based .htaccess entries for its redirects.
Developers have a series of tools on their end that enforce changes and/or make things scale. It is our job to make our recommendations through this frame to get devs to actually implement them. Otherwise, the dev team will always find a way to push back.
Common SEO implementation tasks on the Anderson-Alderson scale
Certain SEO-specific tasks require more dev effort than others and rate differently on the Anderson-Alderson scale, where placement on that scale indicates what type of developer you need to be working with in order for those recommendations to be implemented. The following illustrates where these common tasks generally fall on that scale.
Updating metadata. This process is typically quite tedious. Unless the copy is prepared where it’s easily extracted and placed into the page, it would require page-by-page updates in the CMS, or pulling from the document we prepare and placing it into the database.
Updating body copy or embedded structured data. Similar to updating metadata, this is also quite tedious and requires page-by-page updates. In cases where we’re talking about updating schema.org code that’s integrated within the content rather than placed in the page’s <head> using JSON-LD, this is a nightmare for a developer to implement.
Updating internal linking structure. This could potentially be done programmatically, but only if the relationships are effectively identified. In most cases, SEOs present the recommendation on a page-by-page level, and a developer cannot effectively scale that effort.
Optimizing code for performance. Developers tend to be obsessed with speed, so much so that they shorten the word “performance” to “perf” so it can be said faster. However, they have an aversion to the critical rendering path recommendations that come out of Google’s PageSpeed Insights. Of the SEO recommendations I make, these are the ones I tread the most lightly with because it’s an area in which developers are often defensive. Pro tip: Use the DevTools Timeline and Network performance detail to get them on board with page speed optimizations. They tend to react better to those.
Generating XML sitemaps following site taxonomy. There are many tools that support the development of XML sitemaps, but developers tend to just let those rip. This leads to XML sitemaps like “sitemap14.xml” rather than those that reflect meaningful segmentation following the site taxonomy and are therefore useful to SEOs for managing indexation.
Generating HTML snapshots. Some JavaScript Single Page App frameworks such as Angular 1.x have historically had difficulty getting indexed. But developers have heard that Google is crawling using headless browsers, and they know that Angular is a framework developed by Google, so they sometimes are not compelled to account for its shortcomings.
Implementing redirects. Redirects can be scaled pretty easily, as they are often done on the server configuration level and written through a series of pattern-matched rules. It’s extremely rare (in my experience) that a developer will not follow through on these.
Fixing improper redirects. Conversely, when it comes to switching redirects from 302s to 301s, I have seen pushback from development teams. In fact, I was once told that the switch might break the site.
Clearly, we need to seek out better developers to work with, or we have to find a way to make our recommendations Anderson-proof.
Allow me to introduce you to task runners
Web development, primarily on the front end, gets more and more complicated with each passing day. One of the more valuable concepts introduced in the past five years is the task runner.
Task runners such as Gulp and Grunt allow developers to automate a series of tasks every time they push new code. A more recent addition, Webpack, also features task-running capability. This is largely to keep developers out of doing mundane or tedious processes the machine itself can do, and many web projects are leveraging these for that purpose.
Without going into the specifics of the tools themselves, communities have grown around Grunt, Gulp and Webpack; as a result, a series of plugins is available. Of course, custom modules can be written for each, but the less work you create for developers, the better.
Going back to the idea of updating metadata at scale, there is a Grunt plugin that allows you to provide an XLSX file with changes to page titles, meta descriptions, and Open Graph metadata.
Simply offering the file, developing a column mapping and running the task would then update all of the relevant pages on your site. Sure, what I’m suggesting applies more to flat files than content in a CMS, but of course there are task runners that run on the database level as well.
A developer could effectively modify this plugin to edit the database rather than editing files, or your Excel file could be converted to an SQL file quite quickly and run as an UPDATE across the database.
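If no off-the-shelf plugin fits, the same idea works as a small custom Grunt task. The sketch below is only illustrative: it assumes the approved changes have been exported from the spreadsheet into a hypothetical meta-changes.json file that maps flat HTML file paths to new titles.

```js
// Gruntfile.js: a minimal custom task that applies approved <title> changes.
// Assumes a hypothetical meta-changes.json like:
//   { "about/index.html": "About Acme | Widgets Since 1999", ... }
module.exports = function (grunt) {
  grunt.registerTask('update-titles', 'Apply spreadsheet title changes', function () {
    const changes = grunt.file.readJSON('meta-changes.json');

    Object.keys(changes).forEach(function (path) {
      if (!grunt.file.exists(path)) {
        grunt.log.warn('Missing file, skipping: ' + path);
        return;
      }
      const html = grunt.file.read(path);
      // Swap whatever is inside <title>...</title> for the approved copy.
      const updated = html.replace(
        /<title>[\s\S]*?<\/title>/i,
        '<title>' + changes[path] + '</title>'
      );
      grunt.file.write(path, updated);
      grunt.log.ok('Updated title in ' + path);
    });
  });
};
```

Running grunt update-titles then applies every approved change in one pass, which is exactly the kind of mechanical job a developer is happy to hand to the machine.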
Finally, most modern content management systems have plugins or modules that allow developers to scale tedious tasks to similar effect. It’s up to you to do the research and know about them when preparing your recommendations.
Common SEO recommendations you can use task runners for
Grunt, Gulp and Webpack all have a series of plugins offering configurable functionality that allows a developer to quickly execute tedious SEO tasks. The following is a (non-exhaustive) list of SEO tasks and some plugins that can be used for them:
Code minification
Image compression
Automatic updates to XML sitemaps
AMP validation
AMP creation
Updating meta tags
Generating HTML snapshots
Page speed insights
Each of these plugins will allow you to prepare a specification (and, in some cases, support files). Then the developer simply has to configure the plugin to reflect it and run the tasks. You’ve effectively made their job quite easy by leveraging these tools.
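For the first couple of items on that list, the configuration is often all the developer needs to write. As a rough illustration (the src/ and dist/ paths and target names are made up), minification and image compression with the widely used grunt-contrib-uglify and grunt-contrib-imagemin plugins look something like this:

```js
// Gruntfile.js: configuration-only tasks; paths and target names are examples.
module.exports = function (grunt) {
  grunt.initConfig({
    uglify: {
      build: {
        files: { 'dist/js/app.min.js': ['src/js/app.js'] } // minify JS
      }
    },
    imagemin: {
      build: {
        files: [{
          expand: true,
          cwd: 'src/images/',
          src: ['**/*.{png,jpg,gif}'],
          dest: 'dist/images/' // compressed copies land here
        }]
      }
    }
  });

  grunt.loadNpmTasks('grunt-contrib-uglify');
  grunt.loadNpmTasks('grunt-contrib-imagemin');
  grunt.registerTask('default', ['uglify', 'imagemin']);
};
```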
Outside of the Grunt, Gulp, and Webpack setups, a dev could use Webcheck to automate a series of other checks for several other SEO issues, as highlighted in this StackOverflow thread. The idea is that the developer could write build tests that wouldn’t allow them to deploy the new site unless everything checked out. You can find more plugins by searching npmjs.com.
Other ways to get developers to implement SEO recommendations
Task runners are certainly not a be-all and end-all; rather, they are another tool in the SEO’s toolbox for interfacing with developers effectively. There are many smaller touches that can help you get the development team to take action.
Understand the tech stack, and frame your recommendations within it. Consider a scenario where you’ve suggested 301 redirects for your client. It turns out they are running Nginx instead of Apache. Do you know that Nginx does not employ an .htaccess file? If you don’t, you may suggest placing the 301 redirects there, and the developer may ignore everything else that you’re saying. Tools like BuiltWith.com will give you a general determination of what technologies are in use. A better idea is to look at the HTTP headers in Chrome DevTools. No matter what you do, you should spend the time to get a detailed understanding of the tech stack when your engagement begins.
Give granular detail in your recommendations. If it requires the developer to look elsewhere outside of your document for the solution, you are far less likely to get them to implement the recommendation. Instead, explain the context and implementation in line within your deliverable rather than linking out to other people’s explanations. Although developers tend to never trust other people’s apps, some developers tend to respect your findings from DevTools more than many SEO tools. My guess is that this is due to the combination of granular detail and it being the tool they use every day.
Give one solution, but know the other ones. Often an SEO issue can be solved a number of ways, and it can be hard to fight the desire to fill up your SEO documents by exhaustively highlighting all available options. Fight harder and only deliver one possible solution. Eliminating the need to make a decision will lead to developers being more likely to implement. However, if the development team shoots that one solution down, have another solution ready. For instance, if they can’t move the site from subdomains to subdirectories, then suggest a reverse proxy.
Business cases and prioritization. This is perhaps the most valuable thing you can do to get buy-in up and down the organization, which leads to added pressure on the development team to get things done. Applying a dollar figure to the value of your implementations makes the idea of action more compelling. Prioritizing recommendations through this lens helps as well. Granted, we all know no one can truly predict the size of an opportunity, so do it with some sort of understandable methodology so you can get things to happen.
Understand their development methodology. Whether it’s agile, waterfall, XP, some combination, or some new thing that only one team in the world does, look to understand it. Listen, I can’t stand when someone runs up on me at my desk while I’m in deep concentration to ask me a question they could have Googled. Similarly, developers hate when SEOs come to them and tell them they need to disrupt how they normally operate to accommodate an SEO recommendation. So if that team works in sprints, find out from their Scrum master when the sprint cycle ends and when the best time is to get your requirements into a subsequent sprint. You should also be working directly with this person to develop the recommendations into stories to place into their project management solution so the team can adhere to their standard workflow rather than needing to translate your work into how they operate.
Develop a relationship with the development team. It seems obvious, but the soft skill of becoming friends with the development team will go a long way in their being more likely to work with you. In most cases, the relationship between SEO and technical teams is very transactional, so they only hear from you when you want something. Instead, if you take the time to have a genuine interest in these people, you’ll find they are just people trying to do the best they can, like you and me.
Appeal to their self-interests. To the previous point, there are opportunities to align what you’re trying to do with what they are trying to do. For example, a recent client of ours had a development team looking to optimize page speed, but they were looking more closely at an internal metric rather than the external ones that Google is looking at. It was far easier getting buy-in on that subset of recommendations than any of the other ones because it supported the mandate that the person had been given by his bosses. So it was more valuable for me to focus in on that when speaking to him than on things like redirects. While that required some reprioritization of what I believed to be the most valuable tasks, it did help shift the focus on the page speed effort a bit to ensure that the items that I highlighted got prioritized. You lose some, you win some — as long as the outcome is income!
Do what you can to balance the scale
As a developer, I can tell you that even if you were to become one, it will always be difficult to get development teams to make things happen. However, when you speak their language and take more interest in bringing them the right detail-oriented solution, you will get a lot farther than those that do not.
Improving your deliverables, leveraging task runners, developing business cases, prioritizing effectively and taking a genuine interest in who you’re dealing with will get you much closer to complete implementation and better organic search performance. Best of luck converting your Andersons into Aldersons!
Some opinions expressed in this article may be those of a guest author and not necessarily Search Engine Land. Staff authors are listed here.
Source
http://searchengineland.com/get-developers-implement-seo-recommendations-280318
JavaScript & SEO: Making Your Bot Experience As Good As Your User Experience
Posted by alexis-sanders
Understanding JavaScript and its potential impact on search performance is a core skillset of the modern SEO professional. If search engines can’t crawl a site or can’t parse and understand the content, nothing is going to get indexed and the site is not going to rank.
The most important questions for an SEO relating to JavaScript: Can search engines see the content and grasp the website experience? If not, what solutions can be leveraged to fix this?
Fundamentals
What is JavaScript?
When creating a modern web page, there are three major components:
HTML – Hypertext Markup Language serves as the backbone, or organizer of content, on a site. It is the structure of the website (e.g., headings, paragraphs, list elements, etc.) and defines the static content.
CSS – Cascading Style Sheets are the design, glitz, glam, and style added to a website. It makes up the presentation layer of the page.
JavaScript – JavaScript is the interactivity and a core component of the dynamic web.
Learn more about webpage development and how to code basic JavaScript.
JavaScript is either placed in the HTML document within <script> tags (i.e., it is embedded in the HTML) or linked/referenced. There are currently a plethora of JavaScript libraries and frameworks, including jQuery, AngularJS, ReactJS, EmberJS, etc.
JavaScript libraries and frameworks:
What is AJAX?
AJAX, or Asynchronous JavaScript and XML, is a set of web development techniques combining JavaScript and XML that allows web applications to communicate with a server in the background without interfering with the current page. Asynchronous means that other functions or lines of code can run while the async script is running. XML used to be the primary language to pass data; however, the term AJAX is used for all types of data transfers (including JSON; I guess "AJAJ" doesn’t sound as clean as "AJAX" [pun intended]).
A common use of AJAX is to update the content or layout of a webpage without initiating a full page refresh. Normally, when a page loads, all the assets on the page must be requested and fetched from the server and then rendered on the page. However, with AJAX, only the assets that differ between pages need to be loaded, which improves the user experience as they do not have to refresh the entire page.
One can think of AJAX as mini server calls. A good example of AJAX in action is Google Maps. The page updates without a full page reload (i.e., mini server calls are being used to load content as the user navigates).
What is the Document Object Model (DOM)?
As an SEO professional, you need to understand what the DOM is, because it’s what Google is using to analyze and understand webpages.
The DOM is what you see when you “Inspect Element” in a browser. Simply put, you can think of the DOM as the steps the browser takes after receiving the HTML document to render the page.
The first thing the browser receives is the HTML document. After that, it will start parsing the content within this document and fetch additional resources, such as images, CSS, and JavaScript files.
The DOM is what forms from this parsing of information and resources. One can think of it as a structured, organized version of the webpage’s code.
Nowadays the DOM is often very different from the initial HTML document, due to what’s collectively called dynamic HTML. Dynamic HTML is the ability for a page to change its content depending on user input, environmental conditions (e.g. time of day), and other variables, leveraging HTML, CSS, and JavaScript.
Simple example with a <title> tag that is populated through JavaScript:
HTML source
DOM
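The original screenshots are not reproduced here, but the idea is easy to sketch. Assume a page whose raw HTML ships an empty <title> element plus a script; after the browser executes the script, the DOM (what you see under “Inspect Element”) contains the populated title even though the HTML source never did:

```js
// HTML source as fetched from the server (simplified):
//   <head><title></title></head>
//   <body>...<script src="app.js"></script></body>

// app.js: runs in the browser and fills in the title client-side.
document.title = 'Sample Page Title Set by JavaScript';

// Resulting DOM after rendering:
//   <head><title>Sample Page Title Set by JavaScript</title></head>
```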
What is headless browsing?
Headless browsing is simply the action of fetching webpages without the user interface. It is important to understand because Google, and now Baidu, leverage headless browsing to gain a better understanding of the user’s experience and the content of webpages.
PhantomJS and Zombie.js are scripted headless browsers, typically used for automating web interaction for testing purposes, and rendering static HTML snapshots for initial requests (pre-rendering).
Why can JavaScript be challenging for SEO? (and how to fix issues)
There are three (3) primary reasons to be concerned about JavaScript on your site:
Crawlability: Bots’ ability to crawl your site.
Obtainability: Bots’ ability to access information and parse your content.
Perceived site latency: AKA the Critical Rendering Path.
Crawlability
Are bots able to find URLs and understand your site’s architecture? There are two important elements here:
Blocking search engines from your JavaScript (even accidentally).
Proper internal linking, not leveraging JavaScript events as a replacement for HTML tags.
Why is blocking JavaScript such a big deal?
If search engines are blocked from crawling JavaScript, they will not be receiving your site’s full experience. This means search engines are not seeing what the end user is seeing. This can reduce your site’s appeal to search engines and could eventually be considered cloaking (if the intent is indeed malicious).
Fetch as Google and TechnicalSEO.com’s robots.txt and Fetch and Render testing tools can help to identify resources that Googlebot is blocked from.
The easiest way to solve this problem is through providing search engines access to the resources they need to understand your user experience.
!!! Important note: Work with your development team to determine which files should and should not be accessible to search engines.
Internal linking
Internal linking should be implemented with regular anchor tags within the HTML or the DOM (using an HTML tag) versus leveraging JavaScript functions to allow the user to traverse the site.
Essentially: Don’t use JavaScript’s onclick events as a replacement for internal linking. While end URLs might be found and crawled (through strings in JavaScript code or XML sitemaps), they won’t be associated with the global navigation of the site.
Internal linking is a strong signal to search engines regarding the site’s architecture and importance of pages. In fact, internal links are so strong that they can (in certain situations) override “SEO hints” such as canonical tags.
URL structure
Historically, JavaScript-based websites (aka “AJAX sites”) were using fragment identifiers (#) within URLs.
Not recommended:
The Lone Hash (#) – The lone pound symbol is not crawlable. It is used to identify anchor link (aka jump links). These are the links that allow one to jump to a piece of content on a page. Anything after the lone hash portion of the URL is never sent to the server and will cause the page to automatically scroll to the first element with a matching ID (or the first <a> element with a name of the following information). Google recommends avoiding the use of "#" in URLs.
Hashbang (#!) (and escaped_fragments URLs) – Hashbang URLs were a hack to support crawlers (Google wants to avoid now and only Bing supports). Many a moon ago, Google and Bing developed a complicated AJAX solution, whereby a pretty (#!) URL with the UX co-existed with an equivalent escaped_fragment HTML-based experience for bots. Google has since backtracked on this recommendation, preferring to receive the exact user experience. In escaped fragments, there are two experiences here:
Original Experience (aka Pretty URL): This URL must either have a #! (hashbang) within the URL to indicate that there is an escaped fragment or a meta element indicating that an escaped fragment exists (<meta name="fragment" content="!">).
Escaped Fragment (aka Ugly URL, HTML snapshot): This URL replaces the hashbang (#!) with “_escaped_fragment_” and serves the HTML snapshot. It is called the ugly URL because it’s long and looks like (and for all intents and purposes is) a hack.
Recommended:
pushState History API – PushState is navigation-based and part of the History API (think: your web browsing history). Essentially, pushState updates the URL in the address bar and only what needs to change on the page is updated. It allows JS sites to leverage “clean” URLs. PushState is currently supported by Google, when supporting browser navigation for client-side or hybrid rendering.
A good use of pushState is for infinite scroll (i.e., as the user hits new parts of the page the URL will update). Ideally, if the user refreshes the page, the experience will land them in the exact same spot. However, they do not need to refresh the page, as the content updates as they scroll down, while the URL is updated in the address bar.
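A rough sketch of that pattern, assuming hypothetical /page/N/ URLs: after each new batch of content is appended, the History API updates the address bar so a refresh or a shared link lands the user in roughly the right place.

```js
// After appending the next batch of posts, record the new "position" in the
// address bar without triggering a full page load.
function markPageInHistory(pageNumber) {
  history.pushState({ page: pageNumber }, '', `/page/${pageNumber}/`);
}

// If the user navigates back/forward, the popstate event tells us which page
// state to restore (e.g., scroll to that batch or re-fetch it).
window.addEventListener('popstate', (event) => {
  const page = (event.state && event.state.page) || 1;
  console.log('Restore view for page', page);
});
```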
Example: A good example of a search engine-friendly infinite scroll implementation, created by Google’s John Mueller (go figure), can be found here. He technically leverages the replaceState(), which doesn’t include the same back button functionality as pushState.
Read more: Mozilla PushState History API Documents
Obtainability
Search engines have been shown to employ headless browsing to render the DOM to gain a better understanding of the user’s experience and the content on page. That is to say, Google can process some JavaScript and uses the DOM (instead of the HTML document).
At the same time, there are situations where search engines struggle to comprehend JavaScript. Nobody wants a Hulu situation to happen to their site or a client’s site. It is crucial to understand how bots are interacting with your onsite content. When you aren’t sure, test.
Assuming we’re talking about a search engine bot that executes JavaScript, there are a few important elements for search engines to be able to obtain content:
If the user must interact for something to fire, search engines probably aren’t seeing it.
Google is a lazy user. It doesn’t click, it doesn’t scroll, and it doesn’t log in. If the full UX demands action from the user, special precautions should be taken to ensure that bots are receiving an equivalent experience.
If the JavaScript executes after the load event fires plus ~5 seconds*, search engines may not be seeing it (a small way to test this is sketched after this list).
*John Mueller mentioned that there is no specific timeout value; however, sites should aim to load within five seconds.
*Screaming Frog tests show a correlation to five seconds to render content.
*The load event plus five seconds is what Google’s PageSpeed Insights, Mobile Friendliness Tool, and Fetch as Google use; check out Max Prin’s test timer.
If there are errors within the JavaScript, both browsers and search engines won’t be able to go through and potentially miss sections of pages if the entire code is not executed.
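One cheap way to probe the load-event-plus-five-seconds behavior on your own site is to inject a unique, searchable marker after that window and later check whether it gets indexed. A sketch, with an arbitrary token and delay:

```js
// Tiny obtainability test: inject a unique phrase into the page a few seconds
// after the load event, then later check whether Google has indexed it
// (e.g., via a site: query for the phrase). Token and delay are arbitrary.
window.addEventListener('load', () => {
  setTimeout(() => {
    const marker = document.createElement('p');
    marker.textContent = 'obtainability-test-token-20170615'; // unique token
    document.body.appendChild(marker);
  }, 5000); // fires roughly 5 seconds after the load event
});
```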
How to make sure Google and other search engines can get your content
1. TEST
The most popular solution to resolving JavaScript is probably not resolving anything (grab a coffee and let Google work its algorithmic brilliance). Providing Google with the same experience as searchers is Google’s preferred scenario.
Google first announced being able to “better understand the web (i.e., JavaScript)” in May 2014. Industry experts suggested that Google could crawl JavaScript way before this announcement. The iPullRank team offered two great pieces on this in 2011: Googlebot is Chrome and How smart are Googlebots? (thank you, Josh and Mike). Adam Audette’s Google can crawl JavaScript and leverages the DOM in 2015 confirmed. Therefore, if you can see your content in the DOM, chances are your content is being parsed by Google.
Recently, Barry Goralewicz performed a cool experiment testing a combination of various JavaScript libraries and frameworks to determine how Google interacts with the pages (e.g., are they indexing URL/content? How does GSC interact? Etc.). It ultimately showed that Google is able to interact with many forms of JavaScript and highlighted certain frameworks as perhaps more challenging. John Mueller even started a JavaScript search group (from what I’ve read, it’s fairly therapeutic).
All of these studies are amazing and help SEOs understand when to be concerned and take a proactive role. However, before you determine that sitting back is the right solution for your site, I recommend being actively cautious by experimenting with small sections. Think: Jim Collins’s “bullets, then cannonballs” philosophy from his book Great by Choice:
“A bullet is an empirical test aimed at learning what works and meets three criteria: a bullet must be low-cost, low-risk, and low-distraction… 10Xers use bullets to empirically validate what will actually work. Based on that empirical validation, they then concentrate their resources to fire a cannonball, enabling large returns from concentrated bets.”
Consider testing and reviewing through the following:
Confirm that your content is appearing within the DOM.
Test a subset of pages to see if Google can index content.
Manually check quotes from your content.
Fetch with Google and see if content appears.
Fetch with Google supposedly occurs around the load event or before timeout. It's a great test to check to see if Google will be able to see your content and whether or not you’re blocking JavaScript in your robots.txt. Although Fetch with Google is not foolproof, it’s a good starting point.
Note: If you aren’t verified in GSC, try Technicalseo.com’s Fetch and Render As Any Bot Tool.
After you’ve tested all this, what if something's not working and search engines and bots are struggling to index and obtain your content? Perhaps you’re concerned about alternative search engines (DuckDuckGo, Facebook, LinkedIn, etc.), or maybe you’re leveraging meta information that needs to be parsed by other bots, such as Twitter summary cards or Facebook Open Graph tags. If any of this is identified in testing or presents itself as a concern, an HTML snapshot may be the only decision.
2. HTML SNAPSHOTS
What are HTML snapshots?
HTML snapshots are a fully rendered page (as one might see in the DOM) that can be returned to search engine bots (think: a static HTML version of the DOM).
Google introduced HTML snapshots in 2009, deprecated (but still supported) them in 2015, and awkwardly mentioned them as an element to “avoid” in late 2016. HTML snapshots are a contentious topic with Google. However, they're important to understand, because in certain situations they're necessary.
If search engines (or sites like Facebook) cannot grasp your JavaScript, it’s better to return an HTML snapshot than not to have your content indexed and understood at all. Ideally, your site would leverage some form of user-agent detection on the server side and return the HTML snapshot to the bot.
At the same time, one must recognize that Google wants the same experience as the user (i.e., only provide Google with an HTML snapshot if the tests are dire and the JavaScript search group cannot provide support for your situation).
Considerations
When considering HTML snapshots, you must consider that Google has deprecated this AJAX recommendation. Although Google technically still supports it, Google recommends avoiding it. Yes, Google changed its mind and now wants to receive the same experience as the user. This direction makes sense, as it allows the bot to receive an experience more true to the user experience.
A second consideration factor relates to the risk of cloaking. If the HTML snapshots are found to not represent the experience on the page, it’s considered a cloaking risk. Straight from the source:
“The HTML snapshot must contain the same content as the end user would see in a browser. If this is not the case, it may be considered cloaking.” – Google Developer AJAX Crawling FAQs
Benefits
Despite the considerations, HTML snapshots have powerful advantages:
Knowledge that search engines and crawlers will be able to understand the experience.
Certain types of JavaScript may be harder for Google to grasp (cough... Angular (also colloquially referred to as AngularJS 2) …cough).
Other search engines and crawlers (think: Bing, Facebook) will be able to understand the experience.
Bing, among other search engines, has not stated that it can crawl and index JavaScript. HTML snapshots may be the only solution for a JavaScript-heavy site. As always, test to make sure that this is the case before diving in.
Site latency
When browsers receive an HTML document and create the DOM (although there is some level of pre-scanning), most resources are loaded as they appear within the HTML document. This means that if you have a huge file toward the top of your HTML document, a browser will load that immense file first.
The concept of Google’s critical rendering path is to load what the user needs as soon as possible, which can be translated to → "get everything above-the-fold in front of the user, ASAP."
Critical Rendering Path - Optimized Rendering Loads Progressively ASAP:
However, if you have unnecessary resources or JavaScript files clogging up the page’s ability to load, you get “render-blocking JavaScript.” Meaning: your JavaScript is blocking the page’s potential to appear as if it’s loading faster (also called: perceived latency).
Render-blocking JavaScript – Solutions
If you analyze your page speed results (through tools like Page Speed Insights Tool, WebPageTest.org, CatchPoint, etc.) and determine that there is a render-blocking JavaScript issue, here are three potential solutions:
Inline: Add the JavaScript directly in the HTML document.
Async: Make the JavaScript asynchronous (i.e., add the “async” attribute to the script tag).
Defer: Defer the JavaScript by placing it lower within the HTML (or adding the “defer” attribute); a small loading sketch follows this list.
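A related non-blocking pattern is to inject non-critical scripts from JavaScript itself; browser-created script elements do not block the parser. A minimal sketch, where the file path is just a placeholder:

```js
// Load a non-critical script without blocking HTML parsing or rendering.
// "/js/analytics.js" is a placeholder path for any below-the-fold functionality.
function loadScriptAsync(src) {
  const script = document.createElement('script');
  script.src = src;
  script.async = true; // explicit, though injected scripts are async by default
  document.head.appendChild(script);
}

// Defer the work until the page has finished loading its critical content.
window.addEventListener('load', () => loadScriptAsync('/js/analytics.js'));
```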
!!! Important note: It's important to understand that scripts must be arranged in order of precedence. Scripts that are used to load the above-the-fold content must be prioritized and should not be deferred. Also, any script that references another file can only be used after the referenced file has loaded. Make sure to work closely with your development team to confirm that there are no interruptions to the user’s experience.
Read more: Google Developer’s Speed Documentation
TL;DR - Moral of the story
Crawlers and search engines will do their best to crawl, execute, and interpret your JavaScript, but it is not guaranteed. Make sure your content is crawlable, obtainable, and isn’t developing site latency obstructions. The key = every situation demands testing. Based on the results, evaluate potential solutions.
http://ift.tt/2sKkiUf
0 notes
Text
JavaScript &amp; SEO: Making Your Bot Experience As Good As Your User Experience
Posted by alexis-sanders
Understanding JavaScript and its potential impact on search performance is a core skillset of the modern SEO professional. If search engines can’t crawl a site or can’t parse and understand the content, nothing is going to get indexed and the site is not going to rank.
The most important questions for an SEO relating to JavaScript: Can search engines see the content and grasp the website experience? If not, what solutions can be leveraged to fix this?
FundamentalsWhat is JavaScript?
When creating a modern web page, there are three major components:
HTML – Hypertext Markup Language serves as the backbone, or organizer of content, on a site. It is the structure of the website (e.g. headings, paragraphs, list elements, etc.) and defining static content.
CSS – Cascading Style Sheets are the design, glitz, glam, and style added to a website. It makes up the presentation layer of the page.
JavaScript – JavaScript is the interactivity and a core component of the dynamic web.
Learn more about webpage development and how to code basic JavaScript.
Image sources: 1, 2, 3
JavaScript is either placed in the HTML document within <script> tags (i.e., it is embedded in the HTML) or linked/referenced. There are currently a plethora of JavaScript libraries and frameworks, including jQuery, AngularJS, ReactJS, EmberJS, etc.
JavaScript libraries and frameworks:
What is AJAX?
AJAX, or Asynchronous JavaScript and XML, is a set of web development techniques combining JavaScript and XML that allows web applications to communicate with a server in the background without interfering with the current page. Asynchronous means that other functions or lines of code can run while the async script is running. XML used to be the primary language to pass data; however, the term AJAX is used for all types of data transfers (including JSON; I guess "AJAJ" doesn’t sound as clean as "AJAX" [pun intended]).
A common use of AJAX is to update the content or layout of a webpage without initiating a full page refresh. Normally, when a page loads, all the assets on the page must be requested and fetched from the server and then rendered on the page. However, with AJAX, only the assets that differ between pages need to be loaded, which improves the user experience as they do not have to refresh the entire page.
One can think of AJAX as mini server calls. A good example of AJAX in action is Google Maps. The page updates without a full page reload (i.e., mini server calls are being used to load content as the user navigates).
Image source
What is the Document Object Model (DOM)?
As an SEO professional, you need to understand what the DOM is, because it’s what Google is using to analyze and understand webpages.
The DOM is what you see when you “Inspect Element” in a browser. Simply put, you can think of the DOM as the steps the browser takes after receiving the HTML document to render the page.
The first thing the browser receives is the HTML document. After that, it will start parsing the content within this document and fetch additional resources, such as images, CSS, and JavaScript files.
The DOM is what forms from this parsing of information and resources. One can think of it as a structured, organized version of the webpage’s code.
Nowadays the DOM is often very different from the initial HTML document, due to what’s collectively called dynamic HTML. Dynamic HTML is the ability for a page to change its content depending on user input, environmental conditions (e.g. time of day), and other variables, leveraging HTML, CSS, and JavaScript.
Simple example with a <title> tag that is populated through JavaScript:
HTML source
DOM
What is headless browsing?
Headless browsing is simply the action of fetching webpages without the user interface. It is important to understand because Google, and now Baidu, leverage headless browsing to gain a better understanding of the user’s experience and the content of webpages.
PhantomJS and Zombie.js are scripted headless browsers, typically used for automating web interaction for testing purposes, and rendering static HTML snapshots for initial requests (pre-rendering).
Why can JavaScript be challenging for SEO? (and how to fix issues)
There are three (3) primary reasons to be concerned about JavaScript on your site:
Crawlability: Bots’ ability to crawl your site.
Obtainability: Bots’ ability to access information and parse your content.
Perceived site latency: AKA the Critical Rendering Path.
Crawlability
Are bots able to find URLs and understand your site’s architecture? There are two important elements here:
Blocking search engines from your JavaScript (even accidentally).
Proper internal linking, not leveraging JavaScript events as a replacement for HTML tags.
Why is blocking JavaScript such a big deal?
If search engines are blocked from crawling JavaScript, they will not be receiving your site’s full experience. This means search engines are not seeing what the end user is seeing. This can reduce your site’s appeal to search engines and could eventually be considered cloaking (if the intent is indeed malicious).
Fetch as Google and TechnicalSEO.com’s robots.txt and Fetch and Render testing tools can help to identify resources that Googlebot is blocked from.
The easiest way to solve this problem is through providing search engines access to the resources they need to understand your user experience.
!!! Important note: Work with your development team to determine which files should and should not be accessible to search engines.
Internal linking
Internal linking should be implemented with regular anchor tags within the HTML or the DOM (i.e., an <a> element with an href attribute) versus leveraging JavaScript functions to allow the user to traverse the site.
Essentially: Don’t use JavaScript’s onclick events as a replacement for internal linking. While end URLs might be found and crawled (through strings in JavaScript code or XML sitemaps), they won’t be associated with the global navigation of the site.
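For illustration (the /women/dresses URL is made up), the difference looks like this:

    <!-- Crawlable internal link: a real anchor tag with a real href -->
    <a href="/women/dresses">Dresses</a>

    <!-- Not recommended: the URL only exists inside a JavaScript event,
         so it carries no link signal even if the string itself is discovered -->
    <span onclick="window.location.href='/women/dresses'">Dresses</span>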
Internal linking is a strong signal to search engines regarding the site’s architecture and importance of pages. In fact, internal links are so strong that they can (in certain situations) override “SEO hints” such as canonical tags.
URL structure
Historically, JavaScript-based websites (aka “AJAX sites”) were using fragment identifiers (#) within URLs.
Not recommended:
The Lone Hash (#) – The lone pound symbol is not crawlable. It is used to identify anchor links (aka jump links), which let a user jump to a piece of content on a page. Anything after the lone hash portion of the URL is never sent to the server; it simply causes the page to scroll to the first element with a matching ID (or the first <a> element with a matching name attribute). Google recommends avoiding the use of "#" in URLs.
Hashbang (#!) (and escaped_fragment URLs) – Hashbang URLs were a hack to support crawlers (one Google now wants site owners to avoid, and one only Bing still supports). Many a moon ago, Google and Bing developed a complicated AJAX solution whereby a pretty (#!) URL serving the user experience co-existed with an equivalent escaped_fragment, HTML-based experience for bots. Google has since backtracked on this recommendation, preferring to receive the exact user experience. Under the escaped fragment scheme, there are two experiences:
Original Experience (aka Pretty URL): This URL must either have a #! (hashbang) within the URL to indicate that there is an escaped fragment or a meta element indicating that an escaped fragment exists (<meta name="fragment" content="!">).
Escaped Fragment (aka Ugly URL, HTML snapshot): This URL replaces the hashbang (#!) with “_escaped_fragment_” and serves the HTML snapshot. It is called the ugly URL because it’s long and looks like (and for all intents and purposes is) a hack. See the example URLs directly below.
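Roughly, the old scheme mapped URLs like this (example.com and the product path are placeholders):

    Pretty URL (what the user sees):
      https://www.example.com/#!/products/123
    Ugly URL (what the crawler requested under the old AJAX crawling scheme):
      https://www.example.com/?_escaped_fragment_=/products/123
    Pages without a #! could opt in via:
      <meta name="fragment" content="!">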
Recommended:
pushState History API – PushState is navigation-based and part of the History API (think: your web browsing history). Essentially, pushState updates the URL in the address bar, and only what needs to change on the page is updated. It allows JS sites to leverage “clean” URLs, and it is currently supported by Google for browser navigation in client-side or hybrid rendering setups.
A good use of pushState is for infinite scroll (i.e., as the user hits new parts of the page the URL will update). Ideally, if the user refreshes the page, the experience will land them in the exact same spot. However, they do not need to refresh the page, as the content updates as they scroll down, while the URL is updated in the address bar.
Example: A good example of a search engine-friendly infinite scroll implementation, created by Google’s John Mueller (go figure), can be found here. He technically leverages replaceState(), which doesn’t include the same back-button functionality as pushState.
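A minimal sketch of the pattern, assuming a hypothetical /blog/page/N endpoint that returns an HTML fragment and a #post-list container:

    function loadNextPage(pageNumber) {
      fetch('/blog/page/' + pageNumber)                       // AJAX request for the next chunk
        .then(function (response) { return response.text(); })
        .then(function (html) {
          document.getElementById('post-list').insertAdjacentHTML('beforeend', html);
          // Update the address bar without a reload, so the URL reflects the content
          history.pushState({ page: pageNumber }, '', '/blog/page/' + pageNumber);
        });
    }

    // Keep the back/forward buttons meaningful
    window.addEventListener('popstate', function (event) {
      if (event.state && event.state.page) {
        // Re-render the content for event.state.page here
      }
    });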
Read more: Mozilla PushState History API Documents
Obtainability
Search engines have been shown to employ headless browsing to render the DOM to gain a better understanding of the user’s experience and the content on page. That is to say, Google can process some JavaScript and uses the DOM (instead of the HTML document).
At the same time, there are situations where search engines struggle to comprehend JavaScript. Nobody wants a Hulu situation to happen to their site or a client’s site. It is crucial to understand how bots are interacting with your onsite content. When you aren’t sure, test.
Assuming we’re talking about a search engine bot that executes JavaScript, there are a few important elements for search engines to be able to obtain content:
If the user must interact for something to fire, search engines probably aren’t seeing it.
Google is a lazy user. It doesn’t click, it doesn’t scroll, and it doesn’t log in. If the full UX demands action from the user, special precautions should be taken to ensure that bots are receiving an equivalent experience (a sketch of this trap follows this list).
If the JavaScript runs after the load event fires plus roughly five seconds*, search engines may not be seeing it.
*John Mueller mentioned that there is no specific timeout value; however, sites should aim to load within five seconds.
*Screaming Frog tests show a correlation to five seconds to render content.
*The load event plus five seconds is what Google’s PageSpeed Insights, Mobile Friendliness Tool, and Fetch as Google use; check out Max Prin’s test timer.
If there are errors within the JavaScript, both browsers and search engines may stop executing the script partway through and miss sections of the page that depend on code that never ran.
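To illustrate the first point, here is the kind of interaction-gated loading that bots are unlikely to see, next to a safer variant that renders the same content on load (the endpoint and element IDs are invented):

    // Risky for indexing: the content only exists after a user clicks
    document.getElementById('show-specs').addEventListener('click', function () {
      fetch('/api/product-specs')
        .then(function (r) { return r.text(); })
        .then(function (html) { document.getElementById('specs').innerHTML = html; });
    });

    // Safer: fetch and render the same content on page load,
    // and let the click merely toggle its visibility
    document.addEventListener('DOMContentLoaded', function () {
      fetch('/api/product-specs')
        .then(function (r) { return r.text(); })
        .then(function (html) { document.getElementById('specs').innerHTML = html; });
    });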
How to make sure Google and other search engines can get your content
1. TEST
The most popular solution to resolving JavaScript is probably not resolving anything (grab a coffee and let Google work its algorithmic brilliance). Providing Google with the same experience as searchers is Google’s preferred scenario.
Google first announced being able to “better understand the web (i.e., JavaScript)” in May 2014. Industry experts suggested that Google could crawl JavaScript well before this announcement. The iPullRank team offered two great pieces on this in 2011: Googlebot is Chrome and How smart are Googlebots? (thank you, Josh and Mike). Adam Audette’s 2015 work confirmed that Google can crawl JavaScript and leverages the DOM. Therefore, if you can see your content in the DOM, chances are it is being parsed by Google.
Recently, Bartosz Goralewicz performed a cool experiment testing various JavaScript libraries and frameworks to determine how Google interacts with the pages (e.g., is it indexing URLs and content? How does GSC interact? Etc.). It ultimately showed that Google is able to interact with many forms of JavaScript, while highlighting certain frameworks as potentially more challenging. John Mueller even started a JavaScript search group (from what I’ve read, it’s fairly therapeutic).
All of these studies are amazing and help SEOs understand when to be concerned and take a proactive role. However, before you decide that sitting back is the right solution for your site, I recommend being actively cautious by experimenting with small sections. Think: Jim Collins’s “bullets, then cannonballs” philosophy from his book Great by Choice:
“A bullet is an empirical test aimed at learning what works and meets three criteria: a bullet must be low-cost, low-risk, and low-distraction… 10Xers use bullets to empirically validate what will actually work. Based on that empirical validation, they then concentrate their resources to fire a cannonball, enabling large returns from concentrated bets.”
Consider testing and reviewing through the following:
Confirm that your content is appearing within the DOM (a quick console check is sketched below, after this list).
Test a subset of pages to see if Google can index content.
Manually check quotes from your content.
Fetch as Google and see if content appears.
Fetch as Google supposedly occurs around the load event or before timeout. It’s a great way to check whether Google will be able to see your content and whether you’re blocking JavaScript in your robots.txt. Although Fetch as Google is not foolproof, it’s a good starting point.
Note: If you aren’t verified in GSC, try Technicalseo.com’s Fetch and Render As Any Bot Tool.
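For the first check in the list above, a quick, low-tech sketch is to open the browser console on the rendered page and confirm that an exact phrase is present in the DOM (the phrase below is a placeholder):

    var phrase = 'an exact sentence from your page copy';
    console.log(document.body.innerText.indexOf(phrase) !== -1);   // true → the copy made it into the DOM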
After you’ve tested all this, what if something’s not working and search engines and bots are struggling to index and obtain your content? Perhaps you’re concerned about alternative search engines (DuckDuckGo, Facebook, LinkedIn, etc.), or maybe you’re leveraging meta information that needs to be parsed by other bots, such as Twitter summary cards or Facebook Open Graph tags. If any of this is identified in testing or presents itself as a concern, an HTML snapshot may be the only option.
2. HTML SNAPSHOTS
What are HTML snapshots?
HTML snapshots are fully rendered pages (as one might see in the DOM) that can be returned to search engine bots (think: a static HTML version of the DOM).
Google introduced HTML snapshots in 2009, deprecated (but still supported) them in 2015, and awkwardly mentioned them as an element to “avoid” in late 2016. HTML snapshots are a contentious topic with Google. However, they're important to understand, because in certain situations they're necessary.
If search engines (or sites like Facebook) cannot grasp your JavaScript, it’s better to return an HTML snapshot than not to have your content indexed and understood at all. Ideally, your site would leverage some form of user-agent detection on the server side and return the HTML snapshot to the bot.
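A minimal Node/Express sketch of that idea — the bot pattern, the routes, and the prerender() stub are all illustrative assumptions, not a recommended production setup:

    const express = require('express');
    const app = express();

    const BOT_PATTERN = /googlebot|bingbot|facebookexternalhit|twitterbot|linkedinbot/i;

    // Stand-in for a real pre-rendering step (a headless browser or a cached snapshot)
    function prerender(url) {
      return '<!DOCTYPE html><html><body><h1>Snapshot of ' + url + '</h1></body></html>';
    }

    app.get('*', function (req, res, next) {
      if (BOT_PATTERN.test(req.headers['user-agent'] || '')) {
        return res.send(prerender(req.originalUrl));   // bots get the rendered HTML snapshot
      }
      next();   // everyone else gets the normal JavaScript-driven experience
    });

    app.listen(3000);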
At the same time, one must recognize that Google wants the same experience as the user (i.e., only provide Google with an HTML snapshot if the tests are dire and the JavaScript search group cannot provide support for your situation).
Considerations
When considering HTML snapshots, you must keep in mind that Google has deprecated this AJAX crawling recommendation. Although Google technically still supports it, it recommends avoiding it. Yes, Google changed its mind and now wants to receive the same experience as the user. This direction makes sense, as it allows the bot to receive an experience that is truer to the user experience.
A second consideration relates to the risk of cloaking. If the HTML snapshots are found not to represent the experience on the page, it’s considered a cloaking risk. Straight from the source:
“The HTML snapshot must contain the same content as the end user would see in a browser. If this is not the case, it may be considered cloaking.” – Google Developer AJAX Crawling FAQs
Benefits
Despite the considerations, HTML snapshots have powerful advantages:
Knowledge that search engines and crawlers will be able to understand the experience.
Certain types of JavaScript may be harder for Google to grasp (cough... Angular (also referred to as Angular 2) …cough).
Other search engines and crawlers (think: Bing, Facebook) will be able to understand the experience.
Bing, among other search engines, has not stated that it can crawl and index JavaScript. HTML snapshots may be the only solution for a JavaScript-heavy site. As always, test to make sure that this is the case before diving in.
Site latency
When browsers receive an HTML document and create the DOM (although there is some level of pre-scanning), most resources are loaded as they appear within the HTML document. This means that if you have a huge file toward the top of your HTML document, a browser will load that immense file first.
The concept of Google’s critical rendering path is to load what the user needs as soon as possible, which can be translated to → "get everything above-the-fold in front of the user, ASAP."
Critical Rendering Path - Optimized Rendering Loads Progressively ASAP:
However, if you have unnecessary resources or JavaScript files clogging up the page’s ability to load, you get “render-blocking JavaScript.” Meaning: your JavaScript is blocking the page’s potential to appear as if it’s loading faster (also called: perceived latency).
Render-blocking JavaScript – Solutions
If you analyze your page speed results (through tools like Page Speed Insights Tool, WebPageTest.org, CatchPoint, etc.) and determine that there is a render-blocking JavaScript issue, here are three potential solutions:
Inline: Add the JavaScript in the HTML document.
Async: Make the JavaScript asynchronous (i.e., add the “async” attribute to the script tag).
Defer: Defer the JavaScript by adding the “defer” attribute or by placing the script lower within the HTML. Each option is sketched after the note below.
!!! Important note: It's important to understand that scripts must be arranged in order of precedence. Scripts that are used to load the above-the-fold content must be prioritized and should not be deferred. Also, any script that references another file can only be used after the referenced file has loaded. Make sure to work closely with your development team to confirm that there are no interruptions to the user’s experience.
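A sketch of the three options (the script paths are placeholders); note that async scripts can execute in any order, while defer preserves document order and waits until parsing finishes:

    <!-- Render-blocking: parsing stops until this script downloads and runs -->
    <script src="/js/widgets.js"></script>

    <!-- Async: downloads in parallel, executes as soon as it arrives (order not guaranteed) -->
    <script async src="/js/analytics.js"></script>

    <!-- Defer: downloads in parallel, executes in document order after parsing completes -->
    <script defer src="/js/widgets.js"></script>

    <!-- Inline: small, critical snippets can live directly in the HTML -->
    <script>
      document.documentElement.className += ' js';
    </script>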
Read more: Google Developer’s Speed Documentation
TL;DR - Moral of the story
Crawlers and search engines will do their best to crawl, execute, and interpret your JavaScript, but it is not guaranteed. Make sure your content is crawlable, obtainable, and isn’t developing site latency obstructions. The key = every situation demands testing. Based on the results, evaluate potential solutions.
http://ift.tt/2sKkiUf
Text
JavaScript &amp; SEO: Making Your Bot Experience As Good As Your User Experience
Posted by alexis-sanders
Understanding JavaScript and its potential impact on search performance is a core skillset of the modern SEO professional. If search engines can’t crawl a site or can’t parse and understand the content, nothing is going to get indexed and the site is not going to rank.
The most important questions for an SEO relating to JavaScript: Can search engines see the content and grasp the website experience? If not, what solutions can be leveraged to fix this?
FundamentalsWhat is JavaScript?
When creating a modern web page, there are three major components:
HTML – Hypertext Markup Language serves as the backbone, or organizer of content, on a site. It is the structure of the website (e.g. headings, paragraphs, list elements, etc.) and defining static content.
CSS – Cascading Style Sheets are the design, glitz, glam, and style added to a website. It makes up the presentation layer of the page.
JavaScript – JavaScript is the interactivity and a core component of the dynamic web.
Learn more about webpage development and how to code basic JavaScript.
Image sources: 1, 2, 3
JavaScript is either placed in the HTML document within <script> tags (i.e., it is embedded in the HTML) or linked/referenced. There are currently a plethora of JavaScript libraries and frameworks, including jQuery, AngularJS, ReactJS, EmberJS, etc.
JavaScript libraries and frameworks:
What is AJAX?
AJAX, or Asynchronous JavaScript and XML, is a set of web development techniques combining JavaScript and XML that allows web applications to communicate with a server in the background without interfering with the current page. Asynchronous means that other functions or lines of code can run while the async script is running. XML used to be the primary language to pass data; however, the term AJAX is used for all types of data transfers (including JSON; I guess "AJAJ" doesn’t sound as clean as "AJAX" [pun intended]).
A common use of AJAX is to update the content or layout of a webpage without initiating a full page refresh. Normally, when a page loads, all the assets on the page must be requested and fetched from the server and then rendered on the page. However, with AJAX, only the assets that differ between pages need to be loaded, which improves the user experience as they do not have to refresh the entire page.
One can think of AJAX as mini server calls. A good example of AJAX in action is Google Maps. The page updates without a full page reload (i.e., mini server calls are being used to load content as the user navigates).
Image source
What is the Document Object Model (DOM)?
As an SEO professional, you need to understand what the DOM is, because it’s what Google is using to analyze and understand webpages.
The DOM is what you see when you “Inspect Element” in a browser. Simply put, you can think of the DOM as the steps the browser takes after receiving the HTML document to render the page.
The first thing the browser receives is the HTML document. After that, it will start parsing the content within this document and fetch additional resources, such as images, CSS, and JavaScript files.
The DOM is what forms from this parsing of information and resources. One can think of it as a structured, organized version of the webpage’s code.
Nowadays the DOM is often very different from the initial HTML document, due to what’s collectively called dynamic HTML. Dynamic HTML is the ability for a page to change its content depending on user input, environmental conditions (e.g. time of day), and other variables, leveraging HTML, CSS, and JavaScript.
Simple example with a <title> tag that is populated through JavaScript:
HTML source
DOM
What is headless browsing?
Headless browsing is simply the action of fetching webpages without the user interface. It is important to understand because Google, and now Baidu, leverage headless browsing to gain a better understanding of the user’s experience and the content of webpages.
PhantomJS and Zombie.js are scripted headless browsers, typically used for automating web interaction for testing purposes, and rendering static HTML snapshots for initial requests (pre-rendering).
Why can JavaScript be challenging for SEO? (and how to fix issues)
There are three (3) primary reasons to be concerned about JavaScript on your site:
Crawlability: Bots’ ability to crawl your site.
Obtainability: Bots’ ability to access information and parse your content.
Perceived site latency: AKA the Critical Rendering Path.
Crawlability
Are bots able to find URLs and understand your site’s architecture? There are two important elements here:
Blocking search engines from your JavaScript (even accidentally).
Proper internal linking, not leveraging JavaScript events as a replacement for HTML tags.
Why is blocking JavaScript such a big deal?
If search engines are blocked from crawling JavaScript, they will not be receiving your site’s full experience. This means search engines are not seeing what the end user is seeing. This can reduce your site’s appeal to search engines and could eventually be considered cloaking (if the intent is indeed malicious).
Fetch as Google and TechnicalSEO.com’s robots.txt and Fetch and Render testing tools can help to identify resources that Googlebot is blocked from.
The easiest way to solve this problem is through providing search engines access to the resources they need to understand your user experience.
!!! Important note: Work with your development team to determine which files should and should not be accessible to search engines.
Internal linking
Internal linking should be implemented with regular anchor tags within the HTML or the DOM (using an HTML tag) versus leveraging JavaScript functions to allow the user to traverse the site.
Essentially: Don’t use JavaScript’s onclick events as a replacement for internal linking. While end URLs might be found and crawled (through strings in JavaScript code or XML sitemaps), they won’t be associated with the global navigation of the site.
Internal linking is a strong signal to search engines regarding the site’s architecture and importance of pages. In fact, internal links are so strong that they can (in certain situations) override “SEO hints” such as canonical tags.
URL structure
Historically, JavaScript-based websites (aka “AJAX sites”) were using fragment identifiers (#) within URLs.
Not recommended:
The Lone Hash (#) – The lone pound symbol is not crawlable. It is used to identify anchor links (aka jump links), which allow one to jump to a piece of content on a page. Anything after the lone hash portion of the URL is never sent to the server; it simply causes the page to scroll to the first element with a matching ID (or the first <a> element with a matching name attribute); see the short snippet after this list. Google recommends avoiding the use of "#" in URLs.
Hashbang (#!) (and escaped_fragment URLs) – Hashbang URLs were a hack to support crawlers (Google now wants site owners to avoid them, and only Bing still supports them). Many moons ago, Google and Bing developed a complicated AJAX solution whereby a pretty (#!) URL for the user experience co-existed with an equivalent escaped_fragment, HTML-based experience for bots. Google has since backtracked on this recommendation, preferring to receive the exact user experience. With escaped fragments, there are two experiences:
Original Experience (aka Pretty URL): This URL must either have a #! (hashbang) within the URL to indicate that there is an escaped fragment or a meta element indicating that an escaped fragment exists (<meta name="fragment" content="!">).
Escaped Fragment (aka Ugly URL, HTML snapshot): This URL replaces the hashbang (#!) with “_escaped_fragment_” and serves the HTML snapshot. It is called the ugly URL because it’s long and looks like (and for all intents and purposes is) a hack.
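A quick way to see the "never sent to the server" behavior for yourself (example.com/menu#dinner is a made-up URL):

// For https://example.com/menu#dinner :
console.log(window.location.pathname); // "/menu"   -> the only path the server ever receives
console.log(window.location.hash);     // "#dinner" -> exists client-side only (jump link)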
Recommended:
pushState History API – pushState is navigation-based and part of the History API (think: your web browsing history). Essentially, pushState updates the URL in the address bar, and only what needs to change on the page is updated. It allows JS sites to leverage “clean” URLs. Google currently supports pushState when it is used to support browser navigation for client-side or hybrid rendering.
A good use of pushState is for infinite scroll (i.e., as the user hits new parts of the page the URL will update). Ideally, if the user refreshes the page, the experience will land them in the exact same spot. However, they do not need to refresh the page, as the content updates as they scroll down, while the URL is updated in the address bar.
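As a hedged sketch of that pattern (the /category/page/N URL scheme, the #items container, and the partial-HTML endpoint are assumptions), the idea is to append content as the user nears the bottom of the page and keep the address bar in sync with pushState:

// Hedged infinite-scroll sketch: load the next chunk of content as the user nears the
// bottom of the page, then update the URL so refresh/back/share land in the right place.
let currentPage = 1;
let loading = false;

window.addEventListener('scroll', async () => {
  const nearBottom = window.innerHeight + window.scrollY >= document.body.offsetHeight - 200;
  if (!nearBottom || loading) return;

  loading = true;
  currentPage += 1;
  const response = await fetch(`/category/page/${currentPage}?partial=true`); // assumed endpoint
  const html = await response.text();
  document.querySelector('#items').insertAdjacentHTML('beforeend', html);

  // Update the address bar without a reload.
  history.pushState({ page: currentPage }, '', `/category/page/${currentPage}`);
  loading = false;
});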
Example: A good example of a search engine-friendly infinite scroll implementation, created by Google’s John Mueller (go figure), can be found here. He technically leverages replaceState(), which doesn’t include the same back-button functionality as pushState.
Read more: Mozilla PushState History API Documents
Obtainability
Search engines have been shown to employ headless browsing to render the DOM and gain a better understanding of the user’s experience and the content on the page. That is to say, Google can process some JavaScript and uses the DOM (instead of the HTML document).
At the same time, there are situations where search engines struggle to comprehend JavaScript. Nobody wants a Hulu situation to happen to their site or a client’s site. It is crucial to understand how bots are interacting with your onsite content. When you aren’t sure, test.
Assuming we’re talking about a search engine bot that executes JavaScript, there are a few important elements for search engines to be able to obtain content:
If the user must interact for something to fire, search engines probably aren’t seeing it.
Google is a lazy user. It doesn’t click, it doesn’t scroll, and it doesn’t log in. If the full UX demands action from the user, special precautions should be taken to ensure that bots are receiving an equivalent experience.
If the JavaScript executes after the load event fires plus roughly five seconds*, search engines may not be seeing it (see the sketch after this list).
*John Mueller mentioned that there is no specific timeout value; however, sites should aim to load within five seconds.
*Screaming Frog tests show a correlation to five seconds to render content.
*The load event plus five seconds is what Google’s PageSpeed Insights, Mobile Friendliness Tool, and Fetch as Google use; check out Max Prin’s test timer.
If there are errors within the JavaScript, both browsers and search engines may stop executing the script partway through and, as a result, miss sections of the page because the code never runs to completion.
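A small sketch of the timing risk described above (the ten-second delay is an arbitrary illustration): content injected well after the load event may simply never make it into the rendered snapshot.

// Content injected long after the load event is likely to be missed by rendering bots.
window.addEventListener('load', () => {
  setTimeout(() => {
    const late = document.createElement('p');
    late.textContent = 'Content added ~10 seconds after load - bots probably never see this.';
    document.body.appendChild(late);
  }, 10000); // well beyond the ~5-second window the tools above suggest
});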
How to make sure Google and other search engines can get your content
1. TEST
The most popular approach to handling JavaScript issues is often to do nothing at all (grab a coffee and let Google work its algorithmic brilliance). Providing Google with the same experience as searchers is Google’s preferred scenario.
Google first announced being able to “better understand the web (i.e., JavaScript)” in May 2014. Industry experts suggested that Google could crawl JavaScript well before this announcement. The iPullRank team offered two great pieces on this in 2011: Googlebot is Chrome and How smart are Googlebots? (thank you, Josh and Mike). Adam Audette’s 2015 study, Google can crawl JavaScript and leverages the DOM, confirmed it. Therefore, if you can see your content in the DOM, chances are your content is being parsed by Google.
Recently, Bartosz Goralewicz performed a cool experiment testing a combination of various JavaScript libraries and frameworks to determine how Google interacts with the pages (e.g., are they indexing URLs/content? How does GSC interact? etc.). It ultimately showed that Google is able to interact with many forms of JavaScript and highlighted certain frameworks as potentially more challenging. John Mueller even started a JavaScript search group (from what I’ve read, it’s fairly therapeutic).
All of these studies are amazing and help SEOs understand when to be concerned and take a proactive role. However, before you determine that sitting back is the right solution for your site, I recommend being actively cautious by experimenting with small sections of the site. Think: Jim Collins’s “bullets, then cannonballs” philosophy from his book Great by Choice:
“A bullet is an empirical test aimed at learning what works and meets three criteria: a bullet must be low-cost, low-risk, and low-distraction… 10Xers use bullets to empirically validate what will actually work. Based on that empirical validation, they then concentrate their resources to fire a cannonball, enabling large returns from concentrated bets.”
Consider testing and reviewing through the following:
Confirm that your content is appearing within the DOM.
Test a subset of pages to see if Google can index content.
Manually check quotes from your content.
Fetch as Google and see if content appears.
Fetch as Google supposedly takes its snapshot around the load event or before timeout. It's a great way to check whether Google will be able to see your content and whether or not you’re blocking JavaScript in your robots.txt. Although Fetch as Google is not foolproof, it’s a good starting point.
Note: If you aren’t verified in GSC, try Technicalseo.com’s Fetch and Render As Any Bot Tool.
After you’ve tested all this, what if something's not working and search engines and bots are struggling to index and obtain your content? Perhaps you’re concerned about alternative search engines (DuckDuckGo, Facebook, LinkedIn, etc.), or maybe you’re leveraging meta information that needs to be parsed by other bots, such as Twitter summary cards or Facebook Open Graph tags. If any of this is identified in testing or presents itself as a concern, an HTML snapshot may be the only option.
2. HTML SNAPSHOTS
What are HTML snapshots?
An HTML snapshot is a fully rendered page (as one might see in the DOM) that can be returned to search engine bots (think: a static HTML version of the DOM).
Google introduced HTML snapshots in 2009, deprecated (but still supported) them in 2015, and awkwardly mentioned them as an element to “avoid” in late 2016. HTML snapshots are a contentious topic with Google. However, they're important to understand, because in certain situations they're necessary.
If search engines (or sites like Facebook) cannot grasp your JavaScript, it’s better to return an HTML snapshot than not to have your content indexed and understood at all. Ideally, your site would leverage some form of user-agent detection on the server side and return the HTML snapshot to the bot.
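If testing shows a snapshot really is needed (note the caution that follows), a very rough sketch of server-side user-agent detection in Node.js might look like the following; the bot list and file paths are assumptions, not a recommendation of specific values:

// Rough sketch only: serve a pre-rendered snapshot to known bots, the normal app shell to users.
const http = require('http');
const fs = require('fs');

const BOT_PATTERN = /googlebot|bingbot|facebookexternalhit|twitterbot/i; // assumed list

http.createServer((req, res) => {
  const userAgent = req.headers['user-agent'] || '';
  const file = BOT_PATTERN.test(userAgent)
    ? './snapshots/index.html' // static, fully rendered snapshot (assumed path)
    : './app/index.html';      // normal JavaScript-driven experience (assumed path)

  res.writeHead(200, { 'Content-Type': 'text/html' });
  fs.createReadStream(file).pipe(res);
}).listen(3000);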
At the same time, one must recognize that Google wants the same experience as the user (i.e., only provide Google with an HTML snapshot if the tests are dire and the JavaScript search group cannot provide support for your situation).
Considerations
When considering HTML snapshots, keep in mind that Google has deprecated this AJAX crawling recommendation. Although Google technically still supports it, Google recommends avoiding it. Yes, Google changed its mind and now wants to receive the same experience as the user. This direction makes sense, as it allows the bot to receive an experience that is truer to the user experience.
A second consideration relates to the risk of cloaking. If the HTML snapshots are found not to represent the experience on the page, it’s considered a cloaking risk. Straight from the source:
“The HTML snapshot must contain the same content as the end user would see in a browser. If this is not the case, it may be considered cloaking.” – Google Developer AJAX Crawling FAQs
Benefits
Despite the considerations, HTML snapshots have powerful advantages:
Knowledge that search engines and crawlers will be able to understand the experience.
Certain types of JavaScript may be harder for Google to grasp (cough... Angular 2 ...cough).
Other search engines and crawlers (think: Bing, Facebook) will be able to understand the experience.
Bing, among other search engines, has not stated that it can crawl and index JavaScript. HTML snapshots may be the only solution for a JavaScript-heavy site. As always, test to make sure that this is the case before diving in.
Site latency
When browsers receive an HTML document and create the DOM (although there is some level of pre-scanning), most resources are loaded as they appear within the HTML document. This means that if you have a huge file toward the top of your HTML document, a browser will load that immense file first.
The concept of Google’s critical rendering path is to load what the user needs as soon as possible, which can be translated to → "get everything above-the-fold in front of the user, ASAP."
(Figure: the critical rendering path, where optimized rendering loads progressively, ASAP.)
However, if you have unnecessary resources or JavaScript files clogging up the page’s ability to load, you get “render-blocking JavaScript.” Meaning: your JavaScript is blocking the page’s potential to appear as if it’s loading faster (also called: perceived latency).
Render-blocking JavaScript – Solutions
If you analyze your page speed results (through tools like PageSpeed Insights, WebPageTest.org, CatchPoint, etc.) and determine that there is a render-blocking JavaScript issue, here are three potential solutions (a brief sketch follows the list):
Inline: Add the JavaScript in the HTML document.
Async: Make the JavaScript asynchronous (i.e., add the "async" attribute to the script tag).
Defer: Defer the JavaScript by placing it lower within the HTML (or by adding the "defer" attribute to the script tag).
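As a hedged sketch of the async/defer idea (the /js/analytics.js path is an assumption), a non-critical script can also be loaded from JavaScript itself once the HTML has been parsed, which keeps it out of the critical rendering path:

// Load a non-critical script without blocking rendering. Dynamically injected scripts
// behave asynchronously; waiting for DOMContentLoaded gives a defer-like effect.
function loadScriptAsync(src) {
  const script = document.createElement('script');
  script.src = src;
  script.async = true; // explicit, although injected scripts are async by default
  document.head.appendChild(script);
}

document.addEventListener('DOMContentLoaded', () => {
  loadScriptAsync('/js/analytics.js'); // assumed non-critical, below-the-fold script
});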
!!! Important note: It's important to understand that scripts must be arranged in order of precedence. Scripts that are used to load the above-the-fold content must be prioritized and should not be deferred. Also, any script that references another file can only be used after the referenced file has loaded. Make sure to work closely with your development team to confirm that there are no interruptions to the user’s experience.
Read more: Google Developer’s Speed Documentation
TL;DR - Moral of the story
Crawlers and search engines will do their best to crawl, execute, and interpret your JavaScript, but it is not guaranteed. Make sure your content is crawlable and obtainable, and that it isn’t creating site latency obstructions. The key: every situation demands testing. Based on the results, evaluate potential solutions.
http://ift.tt/2sKkiUf
0 notes
Text
JavaScript &amp; SEO: Making Your Bot Experience As Good As Your User Experience
Posted by alexis-sanders
Understanding JavaScript and its potential impact on search performance is a core skillset of the modern SEO professional. If search engines can’t crawl a site or can’t parse and understand the content, nothing is going to get indexed and the site is not going to rank.
The most important questions for an SEO relating to JavaScript: Can search engines see the content and grasp the website experience? If not, what solutions can be leveraged to fix this?
FundamentalsWhat is JavaScript?
When creating a modern web page, there are three major components:
HTML – Hypertext Markup Language serves as the backbone, or organizer of content, on a site. It is the structure of the website (e.g. headings, paragraphs, list elements, etc.) and defining static content.
CSS – Cascading Style Sheets are the design, glitz, glam, and style added to a website. It makes up the presentation layer of the page.
JavaScript – JavaScript is the interactivity and a core component of the dynamic web.
Learn more about webpage development and how to code basic JavaScript.
Image sources: 1, 2, 3
JavaScript is either placed in the HTML document within <script> tags (i.e., it is embedded in the HTML) or linked/referenced. There are currently a plethora of JavaScript libraries and frameworks, including jQuery, AngularJS, ReactJS, EmberJS, etc.
JavaScript libraries and frameworks:
What is AJAX?
AJAX, or Asynchronous JavaScript and XML, is a set of web development techniques combining JavaScript and XML that allows web applications to communicate with a server in the background without interfering with the current page. Asynchronous means that other functions or lines of code can run while the async script is running. XML used to be the primary language to pass data; however, the term AJAX is used for all types of data transfers (including JSON; I guess "AJAJ" doesn’t sound as clean as "AJAX" [pun intended]).
A common use of AJAX is to update the content or layout of a webpage without initiating a full page refresh. Normally, when a page loads, all the assets on the page must be requested and fetched from the server and then rendered on the page. However, with AJAX, only the assets that differ between pages need to be loaded, which improves the user experience as they do not have to refresh the entire page.
One can think of AJAX as mini server calls. A good example of AJAX in action is Google Maps. The page updates without a full page reload (i.e., mini server calls are being used to load content as the user navigates).
Image source
What is the Document Object Model (DOM)?
As an SEO professional, you need to understand what the DOM is, because it’s what Google is using to analyze and understand webpages.
The DOM is what you see when you “Inspect Element” in a browser. Simply put, you can think of the DOM as the steps the browser takes after receiving the HTML document to render the page.
The first thing the browser receives is the HTML document. After that, it will start parsing the content within this document and fetch additional resources, such as images, CSS, and JavaScript files.
The DOM is what forms from this parsing of information and resources. One can think of it as a structured, organized version of the webpage’s code.
Nowadays the DOM is often very different from the initial HTML document, due to what’s collectively called dynamic HTML. Dynamic HTML is the ability for a page to change its content depending on user input, environmental conditions (e.g. time of day), and other variables, leveraging HTML, CSS, and JavaScript.
Simple example with a <title> tag that is populated through JavaScript:
HTML source
DOM
What is headless browsing?
Headless browsing is simply the action of fetching webpages without the user interface. It is important to understand because Google, and now Baidu, leverage headless browsing to gain a better understanding of the user’s experience and the content of webpages.
PhantomJS and Zombie.js are scripted headless browsers, typically used for automating web interaction for testing purposes, and rendering static HTML snapshots for initial requests (pre-rendering).
Why can JavaScript be challenging for SEO? (and how to fix issues)
There are three (3) primary reasons to be concerned about JavaScript on your site:
Crawlability: Bots’ ability to crawl your site.
Obtainability: Bots’ ability to access information and parse your content.
Perceived site latency: AKA the Critical Rendering Path.
Crawlability
Are bots able to find URLs and understand your site’s architecture? There are two important elements here:
Blocking search engines from your JavaScript (even accidentally).
Proper internal linking, not leveraging JavaScript events as a replacement for HTML tags.
Why is blocking JavaScript such a big deal?
If search engines are blocked from crawling JavaScript, they will not be receiving your site’s full experience. This means search engines are not seeing what the end user is seeing. This can reduce your site’s appeal to search engines and could eventually be considered cloaking (if the intent is indeed malicious).
Fetch as Google and TechnicalSEO.com’s robots.txt and Fetch and Render testing tools can help to identify resources that Googlebot is blocked from.
The easiest way to solve this problem is through providing search engines access to the resources they need to understand your user experience.
!!! Important note: Work with your development team to determine which files should and should not be accessible to search engines.
Internal linking
Internal linking should be implemented with regular anchor tags within the HTML or the DOM (using an HTML tag) versus leveraging JavaScript functions to allow the user to traverse the site.
Essentially: Don’t use JavaScript’s onclick events as a replacement for internal linking. While end URLs might be found and crawled (through strings in JavaScript code or XML sitemaps), they won’t be associated with the global navigation of the site.
Internal linking is a strong signal to search engines regarding the site’s architecture and importance of pages. In fact, internal links are so strong that they can (in certain situations) override “SEO hints” such as canonical tags.
URL structure
Historically, JavaScript-based websites (aka “AJAX sites”) were using fragment identifiers (#) within URLs.
Not recommended:
The Lone Hash (#) – The lone pound symbol is not crawlable. It is used to identify anchor link (aka jump links). These are the links that allow one to jump to a piece of content on a page. Anything after the lone hash portion of the URL is never sent to the server and will cause the page to automatically scroll to the first element with a matching ID (or the first <a> element with a name of the following information). Google recommends avoiding the use of "#" in URLs.
Hashbang (#!) (and escaped_fragments URLs) – Hashbang URLs were a hack to support crawlers (Google wants to avoid now and only Bing supports). Many a moon ago, Google and Bing developed a complicated AJAX solution, whereby a pretty (#!) URL with the UX co-existed with an equivalent escaped_fragment HTML-based experience for bots. Google has since backtracked on this recommendation, preferring to receive the exact user experience. In escaped fragments, there are two experiences here:
Original Experience (aka Pretty URL): This URL must either have a #! (hashbang) within the URL to indicate that there is an escaped fragment or a meta element indicating that an escaped fragment exists (<meta name="fragment" content="!">).
Escaped Fragment (aka Ugly URL, HTML snapshot): This URL replace the hashbang (#!) with “_escaped_fragment_” and serves the HTML snapshot. It is called the ugly URL because it’s long and looks like (and for all intents and purposes is) a hack.
Image source
Recommended:
pushState History API – PushState is navigation-based and part of the History API (think: your web browsing history). Essentially, pushState updates the URL in the address bar and only what needs to change on the page is updated. It allows JS sites to leverage “clean” URLs. PushState is currently supported by Google, when supporting browser navigation for client-side or hybrid rendering.
A good use of pushState is for infinite scroll (i.e., as the user hits new parts of the page the URL will update). Ideally, if the user refreshes the page, the experience will land them in the exact same spot. However, they do not need to refresh the page, as the content updates as they scroll down, while the URL is updated in the address bar.
Example: A good example of a search engine-friendly infinite scroll implementation, created by Google’s John Mueller (go figure), can be found here. He technically leverages the replaceState(), which doesn’t include the same back button functionality as pushState.
Read more: Mozilla PushState History API Documents
Obtainability
Search engines have been shown to employ headless browsing to render the DOM to gain a better understanding of the user’s experience and the content on page. That is to say, Google can process some JavaScript and uses the DOM (instead of the HTML document).
At the same time, there are situations where search engines struggle to comprehend JavaScript. Nobody wants a Hulu situation to happen to their site or a client’s site. It is crucial to understand how bots are interacting with your onsite content. When you aren’t sure, test.
Assuming we’re talking about a search engine bot that executes JavaScript, there are a few important elements for search engines to be able to obtain content:
If the user must interact for something to fire, search engines probably aren’t seeing it.
Google is a lazy user. It doesn’t click, it doesn’t scroll, and it doesn’t log in. If the full UX demands action from the user, special precautions should be taken to ensure that bots are receiving an equivalent experience.
If the JavaScript occurs after the JavaScript load event fires plus ~5-seconds*, search engines may not be seeing it.
*John Mueller mentioned that there is no specific timeout value; however, sites should aim to load within five seconds.
*Screaming Frog tests show a correlation to five seconds to render content.
*The load event plus five seconds is what Google’s PageSpeed Insights, Mobile Friendliness Tool, and Fetch as Google use; check out Max Prin’s test timer.
If there are errors within the JavaScript, both browsers and search engines won’t be able to go through and potentially miss sections of pages if the entire code is not executed.
How to make sure Google and other search engines can get your content1. TEST
The most popular solution to resolving JavaScript is probably not resolving anything (grab a coffee and let Google work its algorithmic brilliance). Providing Google with the same experience as searchers is Google’s preferred scenario.
Google first announced being able to “better understand the web (i.e., JavaScript)” in May 2014. Industry experts suggested that Google could crawl JavaScript way before this announcement. The iPullRank team offered two great pieces on this in 2011: Googlebot is Chrome and How smart are Googlebots? (thank you, Josh and Mike). Adam Audette’s Google can crawl JavaScript and leverages the DOM in 2015 confirmed. Therefore, if you can see your content in the DOM, chances are your content is being parsed by Google.
Recently, Barry Goralewicz performed a cool experiment testing a combination of various JavaScript libraries and frameworks to determine how Google interacts with the pages (e.g., are they indexing URL/content? How does GSC interact? Etc.). It ultimately showed that Google is able to interact with many forms of JavaScript and highlighted certain frameworks as perhaps more challenging. John Mueller even started a JavaScript search group (from what I’ve read, it’s fairly therapeutic).
All of these studies are amazing and help SEOs understand when to be concerned and take a proactive role. However, before you determine that sitting back is the right solution for your site, I recommend being actively cautious by experimenting with small section Think: Jim Collin’s “bullets, then cannonballs” philosophy from his book Great by Choice:
“A bullet is an empirical test aimed at learning what works and meets three criteria: a bullet must be low-cost, low-risk, and low-distraction… 10Xers use bullets to empirically validate what will actually work. Based on that empirical validation, they then concentrate their resources to fire a cannonball, enabling large returns from concentrated bets.”
Consider testing and reviewing through the following:
Confirm that your content is appearing within the DOM.
Test a subset of pages to see if Google can index content.
Manually check quotes from your content.
Fetch with Google and see if content appears.
Fetch with Google supposedly occurs around the load event or before timeout. It's a great test to check to see if Google will be able to see your content and whether or not you’re blocking JavaScript in your robots.txt. Although Fetch with Google is not foolproof, it’s a good starting point.
Note: If you aren’t verified in GSC, try Technicalseo.com’s Fetch and Render As Any Bot Tool.
After you’ve tested all this, what if something's not working and search engines and bots are struggling to index and obtain your content? Perhaps you’re concerned about alternative search engines (DuckDuckGo, Facebook, LinkedIn, etc.), or maybe you’re leveraging meta information that needs to be parsed by other bots, such as Twitter summary cards or Facebook Open Graph tags. If any of this is identified in testing or presents itself as a concern, an HTML snapshot may be the only decision.
2. HTML SNAPSHOTSWhat are HTmL snapshots?
HTML snapshots are a fully rendered page (as one might see in the DOM) that can be returned to search engine bots (think: a static HTML version of the DOM).
Google introduced HTML snapshots 2009, deprecated (but still supported) them in 2015, and awkwardly mentioned them as an element to “avoid” in late 2016. HTML snapshots are a contentious topic with Google. However, they're important to understand, because in certain situations they're necessary.
If search engines (or sites like Facebook) cannot grasp your JavaScript, it’s better to return an HTML snapshot than not to have your content indexed and understood at all. Ideally, your site would leverage some form of user-agent detection on the server side and return the HTML snapshot to the bot.
At the same time, one must recognize that Google wants the same experience as the user (i.e., only provide Google with an HTML snapshot if the tests are dire and the JavaScript search group cannot provide support for your situation).
Considerations
When considering HTML snapshots, you must consider that Google has deprecated this AJAX recommendation. Although Google technically still supports it, Google recommends avoiding it. Yes, Google changed its mind and now want to receive the same experience as the user. This direction makes sense, as it allows the bot to receive an experience more true to the user experience.
A second consideration factor relates to the risk of cloaking. If the HTML snapshots are found to not represent the experience on the page, it’s considered a cloaking risk. Straight from the source:
“The HTML snapshot must contain the same content as the end user would see in a browser. If this is not the case, it may be considered cloaking.” – Google Developer AJAX Crawling FAQs
Benefits
Despite the considerations, HTML snapshots have powerful advantages:
Knowledge that search engines and crawlers will be able to understand the experience.
Certain types of JavaScript may be harder for Google to grasp (cough... Angular (also colloquially referred to as AngularJS 2) …cough).
Other search engines and crawlers (think: Bing, Facebook) will be able to understand the experience.
Bing, among other search engines, has not stated that it can crawl and index JavaScript. HTML snapshots may be the only solution for a JavaScript-heavy site. As always, test to make sure that this is the case before diving in.
Site latency
When browsers receive an HTML document and create the DOM (although there is some level of pre-scanning), most resources are loaded as they appear within the HTML document. This means that if you have a huge file toward the top of your HTML document, a browser will load that immense file first.
The concept of Google’s critical rendering path is to load what the user needs as soon as possible, which can be translated to → "get everything above-the-fold in front of the user, ASAP."
Critical Rendering Path - Optimized Rendering Loads Progressively ASAP:
Image source
However, if you have unnecessary resources or JavaScript files clogging up the page’s ability to load, you get “render-blocking JavaScript.” Meaning: your JavaScript is blocking the page’s potential to appear as if it’s loading faster (also called: perceived latency).
Render-blocking JavaScript – Solutions
If you analyze your page speed results (through tools like Page Speed Insights Tool, WebPageTest.org, CatchPoint, etc.) and determine that there is a render-blocking JavaScript issue, here are three potential solutions:
Inline: Add the JavaScript in the HTML document.
Async: Make JavaScript asynchronous (i.e., add “async” attribute to HTML tag).
Defer: By placing JavaScript lower within the HTML.
!!! Important note: It's important to understand that scripts must be arranged in order of precedence. Scripts that are used to load the above-the-fold content must be prioritized and should not be deferred. Also, any script that references another file can only be used after the referenced file has loaded. Make sure to work closely with your development team to confirm that there are no interruptions to the user’s experience.
Read more: Google Developer’s Speed Documentation
TL;DR - Moral of the story
Crawlers and search engines will do their best to crawl, execute, and interpret your JavaScript, but it is not guaranteed. Make sure your content is crawlable, obtainable, and isn’t developing site latency obstructions. The key = every situation demands testing. Based on the results, evaluate potential solutions.
http://ift.tt/2sKkiUf
0 notes
Text
JavaScript &amp; SEO: Making Your Bot Experience As Good As Your User Experience
Posted by alexis-sanders
Understanding JavaScript and its potential impact on search performance is a core skillset of the modern SEO professional. If search engines can’t crawl a site or can’t parse and understand the content, nothing is going to get indexed and the site is not going to rank.
The most important questions for an SEO relating to JavaScript: Can search engines see the content and grasp the website experience? If not, what solutions can be leveraged to fix this?
FundamentalsWhat is JavaScript?
When creating a modern web page, there are three major components:
HTML – Hypertext Markup Language serves as the backbone, or organizer of content, on a site. It is the structure of the website (e.g. headings, paragraphs, list elements, etc.) and defining static content.
CSS – Cascading Style Sheets are the design, glitz, glam, and style added to a website. It makes up the presentation layer of the page.
JavaScript – JavaScript is the interactivity and a core component of the dynamic web.
Learn more about webpage development and how to code basic JavaScript.
Image sources: 1, 2, 3
JavaScript is either placed in the HTML document within <script> tags (i.e., it is embedded in the HTML) or linked/referenced. There are currently a plethora of JavaScript libraries and frameworks, including jQuery, AngularJS, ReactJS, EmberJS, etc.
JavaScript libraries and frameworks:
What is AJAX?
AJAX, or Asynchronous JavaScript and XML, is a set of web development techniques combining JavaScript and XML that allows web applications to communicate with a server in the background without interfering with the current page. Asynchronous means that other functions or lines of code can run while the async script is running. XML used to be the primary language to pass data; however, the term AJAX is used for all types of data transfers (including JSON; I guess "AJAJ" doesn’t sound as clean as "AJAX" [pun intended]).
A common use of AJAX is to update the content or layout of a webpage without initiating a full page refresh. Normally, when a page loads, all the assets on the page must be requested and fetched from the server and then rendered on the page. However, with AJAX, only the assets that differ between pages need to be loaded, which improves the user experience as they do not have to refresh the entire page.
One can think of AJAX as mini server calls. A good example of AJAX in action is Google Maps. The page updates without a full page reload (i.e., mini server calls are being used to load content as the user navigates).
Image source
What is the Document Object Model (DOM)?
As an SEO professional, you need to understand what the DOM is, because it’s what Google is using to analyze and understand webpages.
The DOM is what you see when you “Inspect Element” in a browser. Simply put, you can think of the DOM as the steps the browser takes after receiving the HTML document to render the page.
The first thing the browser receives is the HTML document. After that, it will start parsing the content within this document and fetch additional resources, such as images, CSS, and JavaScript files.
The DOM is what forms from this parsing of information and resources. One can think of it as a structured, organized version of the webpage’s code.
Nowadays the DOM is often very different from the initial HTML document, due to what’s collectively called dynamic HTML. Dynamic HTML is the ability for a page to change its content depending on user input, environmental conditions (e.g. time of day), and other variables, leveraging HTML, CSS, and JavaScript.
Simple example with a <title> tag that is populated through JavaScript:
HTML source
DOM
What is headless browsing?
Headless browsing is simply the action of fetching webpages without the user interface. It is important to understand because Google, and now Baidu, leverage headless browsing to gain a better understanding of the user’s experience and the content of webpages.
PhantomJS and Zombie.js are scripted headless browsers, typically used for automating web interaction for testing purposes, and rendering static HTML snapshots for initial requests (pre-rendering).
Why can JavaScript be challenging for SEO? (and how to fix issues)
There are three (3) primary reasons to be concerned about JavaScript on your site:
Crawlability: Bots’ ability to crawl your site.
Obtainability: Bots’ ability to access information and parse your content.
Perceived site latency: AKA the Critical Rendering Path.
Crawlability
Are bots able to find URLs and understand your site’s architecture? There are two important elements here:
Blocking search engines from your JavaScript (even accidentally).
Proper internal linking, not leveraging JavaScript events as a replacement for HTML tags.
Why is blocking JavaScript such a big deal?
If search engines are blocked from crawling JavaScript, they will not be receiving your site’s full experience. This means search engines are not seeing what the end user is seeing. This can reduce your site’s appeal to search engines and could eventually be considered cloaking (if the intent is indeed malicious).
Fetch as Google and TechnicalSEO.com’s robots.txt and Fetch and Render testing tools can help to identify resources that Googlebot is blocked from.
The easiest way to solve this problem is through providing search engines access to the resources they need to understand your user experience.
!!! Important note: Work with your development team to determine which files should and should not be accessible to search engines.
Internal linking
Internal linking should be implemented with regular anchor tags within the HTML or the DOM (using an HTML tag) versus leveraging JavaScript functions to allow the user to traverse the site.
Essentially: Don’t use JavaScript’s onclick events as a replacement for internal linking. While end URLs might be found and crawled (through strings in JavaScript code or XML sitemaps), they won’t be associated with the global navigation of the site.
Internal linking is a strong signal to search engines regarding the site’s architecture and importance of pages. In fact, internal links are so strong that they can (in certain situations) override “SEO hints” such as canonical tags.
URL structure
Historically, JavaScript-based websites (aka “AJAX sites”) were using fragment identifiers (#) within URLs.
Not recommended:
The Lone Hash (#) – The lone pound symbol is not crawlable. It is used to identify anchor link (aka jump links). These are the links that allow one to jump to a piece of content on a page. Anything after the lone hash portion of the URL is never sent to the server and will cause the page to automatically scroll to the first element with a matching ID (or the first <a> element with a name of the following information). Google recommends avoiding the use of "#" in URLs.
Hashbang (#!) (and escaped_fragments URLs) – Hashbang URLs were a hack to support crawlers (Google wants to avoid now and only Bing supports). Many a moon ago, Google and Bing developed a complicated AJAX solution, whereby a pretty (#!) URL with the UX co-existed with an equivalent escaped_fragment HTML-based experience for bots. Google has since backtracked on this recommendation, preferring to receive the exact user experience. In escaped fragments, there are two experiences here:
Original Experience (aka Pretty URL): This URL must either have a #! (hashbang) within the URL to indicate that there is an escaped fragment or a meta element indicating that an escaped fragment exists (<meta name="fragment" content="!">).
Escaped Fragment (aka Ugly URL, HTML snapshot): This URL replace the hashbang (#!) with “_escaped_fragment_” and serves the HTML snapshot. It is called the ugly URL because it’s long and looks like (and for all intents and purposes is) a hack.
Image source
Recommended:
pushState History API – PushState is navigation-based and part of the History API (think: your web browsing history). Essentially, pushState updates the URL in the address bar and only what needs to change on the page is updated. It allows JS sites to leverage “clean” URLs. PushState is currently supported by Google, when supporting browser navigation for client-side or hybrid rendering.
A good use of pushState is for infinite scroll (i.e., as the user hits new parts of the page the URL will update). Ideally, if the user refreshes the page, the experience will land them in the exact same spot. However, they do not need to refresh the page, as the content updates as they scroll down, while the URL is updated in the address bar.
Example: A good example of a search engine-friendly infinite scroll implementation, created by Google’s John Mueller (go figure), can be found here. He technically leverages the replaceState(), which doesn’t include the same back button functionality as pushState.
Read more: Mozilla PushState History API Documents
Obtainability
Search engines have been shown to employ headless browsing to render the DOM to gain a better understanding of the user’s experience and the content on page. That is to say, Google can process some JavaScript and uses the DOM (instead of the HTML document).
At the same time, there are situations where search engines struggle to comprehend JavaScript. Nobody wants a Hulu situation to happen to their site or a client’s site. It is crucial to understand how bots are interacting with your onsite content. When you aren’t sure, test.
Assuming we’re talking about a search engine bot that executes JavaScript, there are a few important elements for search engines to be able to obtain content:
If the user must interact for something to fire, search engines probably aren’t seeing it.
Google is a lazy user. It doesn’t click, it doesn’t scroll, and it doesn’t log in. If the full UX demands action from the user, special precautions should be taken to ensure that bots are receiving an equivalent experience.
If the JavaScript occurs after the JavaScript load event fires plus ~5-seconds*, search engines may not be seeing it.
*John Mueller mentioned that there is no specific timeout value; however, sites should aim to load within five seconds.
*Screaming Frog tests show a correlation to five seconds to render content.
*The load event plus five seconds is what Google’s PageSpeed Insights, Mobile Friendliness Tool, and Fetch as Google use; check out Max Prin’s test timer.
If there are errors within the JavaScript, both browsers and search engines won’t be able to go through and potentially miss sections of pages if the entire code is not executed.
How to make sure Google and other search engines can get your content1. TEST
The most popular solution to resolving JavaScript is probably not resolving anything (grab a coffee and let Google work its algorithmic brilliance). Providing Google with the same experience as searchers is Google’s preferred scenario.
Google first announced being able to “better understand the web (i.e., JavaScript)” in May 2014. Industry experts suggested that Google could crawl JavaScript way before this announcement. The iPullRank team offered two great pieces on this in 2011: Googlebot is Chrome and How smart are Googlebots? (thank you, Josh and Mike). Adam Audette’s Google can crawl JavaScript and leverages the DOM in 2015 confirmed. Therefore, if you can see your content in the DOM, chances are your content is being parsed by Google.
Recently, Barry Goralewicz performed a cool experiment testing a combination of various JavaScript libraries and frameworks to determine how Google interacts with the pages (e.g., are they indexing URL/content? How does GSC interact? Etc.). It ultimately showed that Google is able to interact with many forms of JavaScript and highlighted certain frameworks as perhaps more challenging. John Mueller even started a JavaScript search group (from what I’ve read, it’s fairly therapeutic).
All of these studies are amazing and help SEOs understand when to be concerned and take a proactive role. However, before you determine that sitting back is the right solution for your site, I recommend being actively cautious by experimenting with small section Think: Jim Collin’s “bullets, then cannonballs” philosophy from his book Great by Choice:
“A bullet is an empirical test aimed at learning what works and meets three criteria: a bullet must be low-cost, low-risk, and low-distraction… 10Xers use bullets to empirically validate what will actually work. Based on that empirical validation, they then concentrate their resources to fire a cannonball, enabling large returns from concentrated bets.”
Consider testing and reviewing through the following:
Confirm that your content is appearing within the DOM.
Test a subset of pages to see if Google can index content.
Manually check quotes from your content.
Fetch with Google and see if content appears.
Fetch with Google supposedly occurs around the load event or before timeout. It's a great test to check to see if Google will be able to see your content and whether or not you’re blocking JavaScript in your robots.txt. Although Fetch with Google is not foolproof, it’s a good starting point.
Note: If you aren’t verified in GSC, try Technicalseo.com’s Fetch and Render As Any Bot Tool.
After you’ve tested all this, what if something's not working and search engines and bots are struggling to index and obtain your content? Perhaps you’re concerned about alternative search engines (DuckDuckGo, Facebook, LinkedIn, etc.), or maybe you’re leveraging meta information that needs to be parsed by other bots, such as Twitter summary cards or Facebook Open Graph tags. If any of this is identified in testing or presents itself as a concern, an HTML snapshot may be the only decision.
2. HTML SNAPSHOTSWhat are HTmL snapshots?
HTML snapshots are a fully rendered page (as one might see in the DOM) that can be returned to search engine bots (think: a static HTML version of the DOM).
Google introduced HTML snapshots 2009, deprecated (but still supported) them in 2015, and awkwardly mentioned them as an element to “avoid” in late 2016. HTML snapshots are a contentious topic with Google. However, they're important to understand, because in certain situations they're necessary.
If search engines (or sites like Facebook) cannot grasp your JavaScript, it’s better to return an HTML snapshot than not to have your content indexed and understood at all. Ideally, your site would leverage some form of user-agent detection on the server side and return the HTML snapshot to the bot.
At the same time, one must recognize that Google wants the same experience as the user (i.e., only provide Google with an HTML snapshot if the tests are dire and the JavaScript search group cannot provide support for your situation).
Considerations
When considering HTML snapshots, you must consider that Google has deprecated this AJAX recommendation. Although Google technically still supports it, Google recommends avoiding it. Yes, Google changed its mind and now want to receive the same experience as the user. This direction makes sense, as it allows the bot to receive an experience more true to the user experience.
A second consideration factor relates to the risk of cloaking. If the HTML snapshots are found to not represent the experience on the page, it’s considered a cloaking risk. Straight from the source:
“The HTML snapshot must contain the same content as the end user would see in a browser. If this is not the case, it may be considered cloaking.” – Google Developer AJAX Crawling FAQs
Benefits
Despite the considerations, HTML snapshots have powerful advantages:
Knowledge that search engines and crawlers will be able to understand the experience.
Certain types of JavaScript may be harder for Google to grasp (cough... Angular (also colloquially referred to as AngularJS 2) …cough).
Other search engines and crawlers (think: Bing, Facebook) will be able to understand the experience.
Bing, among other search engines, has not stated that it can crawl and index JavaScript. HTML snapshots may be the only solution for a JavaScript-heavy site. As always, test to make sure that this is the case before diving in.
Site latency
When browsers receive an HTML document and create the DOM (although there is some level of pre-scanning), most resources are loaded as they appear within the HTML document. This means that if you have a huge file toward the top of your HTML document, a browser will load that immense file first.
The concept of Google’s critical rendering path is to load what the user needs as soon as possible, which can be translated to → "get everything above-the-fold in front of the user, ASAP."
Critical Rendering Path - Optimized Rendering Loads Progressively ASAP:
Image source
However, if you have unnecessary resources or JavaScript files clogging up the page’s ability to load, you get “render-blocking JavaScript.” Meaning: your JavaScript is blocking the page’s potential to appear as if it’s loading faster (also called: perceived latency).
Render-blocking JavaScript – Solutions
If you analyze your page speed results (through tools like Page Speed Insights Tool, WebPageTest.org, CatchPoint, etc.) and determine that there is a render-blocking JavaScript issue, here are three potential solutions:
Inline: Add the JavaScript in the HTML document.
Async: Make JavaScript asynchronous (i.e., add “async” attribute to HTML tag).
Defer: By placing JavaScript lower within the HTML.
!!! Important note: It's important to understand that scripts must be arranged in order of precedence. Scripts that are used to load the above-the-fold content must be prioritized and should not be deferred. Also, any script that references another file can only be used after the referenced file has loaded. Make sure to work closely with your development team to confirm that there are no interruptions to the user’s experience.
Read more: Google Developer’s Speed Documentation
TL;DR - Moral of the story
Crawlers and search engines will do their best to crawl, execute, and interpret your JavaScript, but it is not guaranteed. Make sure your content is crawlable, obtainable, and isn’t developing site latency obstructions. The key = every situation demands testing. Based on the results, evaluate potential solutions.
http://ift.tt/2sKkiUf
0 notes
Text
JavaScript &amp; SEO: Making Your Bot Experience As Good As Your User Experience
Posted by alexis-sanders
Understanding JavaScript and its potential impact on search performance is a core skillset of the modern SEO professional. If search engines can’t crawl a site or can’t parse and understand the content, nothing is going to get indexed and the site is not going to rank.
The most important questions for an SEO relating to JavaScript: Can search engines see the content and grasp the website experience? If not, what solutions can be leveraged to fix this?
FundamentalsWhat is JavaScript?
When creating a modern web page, there are three major components:
HTML – Hypertext Markup Language serves as the backbone, or organizer of content, on a site. It is the structure of the website (e.g. headings, paragraphs, list elements, etc.) and defining static content.
CSS – Cascading Style Sheets are the design, glitz, glam, and style added to a website. It makes up the presentation layer of the page.
JavaScript – JavaScript is the interactivity and a core component of the dynamic web.
Learn more about webpage development and how to code basic JavaScript.
Image sources: 1, 2, 3
JavaScript is either placed in the HTML document within <script> tags (i.e., it is embedded in the HTML) or linked/referenced. There are currently a plethora of JavaScript libraries and frameworks, including jQuery, AngularJS, ReactJS, EmberJS, etc.
JavaScript libraries and frameworks:
What is AJAX?
AJAX, or Asynchronous JavaScript and XML, is a set of web development techniques combining JavaScript and XML that allows web applications to communicate with a server in the background without interfering with the current page. Asynchronous means that other functions or lines of code can run while the async script is running. XML used to be the primary language to pass data; however, the term AJAX is used for all types of data transfers (including JSON; I guess "AJAJ" doesn’t sound as clean as "AJAX" [pun intended]).
A common use of AJAX is to update the content or layout of a webpage without initiating a full page refresh. Normally, when a page loads, all the assets on the page must be requested and fetched from the server and then rendered on the page. However, with AJAX, only the assets that differ between pages need to be loaded, which improves the user experience as they do not have to refresh the entire page.
One can think of AJAX as mini server calls. A good example of AJAX in action is Google Maps. The page updates without a full page reload (i.e., mini server calls are being used to load content as the user navigates).
Image source
What is the Document Object Model (DOM)?
As an SEO professional, you need to understand what the DOM is, because it’s what Google is using to analyze and understand webpages.
The DOM is what you see when you “Inspect Element” in a browser. Simply put, you can think of the DOM as the steps the browser takes after receiving the HTML document to render the page.
The first thing the browser receives is the HTML document. After that, it will start parsing the content within this document and fetch additional resources, such as images, CSS, and JavaScript files.
The DOM is what forms from this parsing of information and resources. One can think of it as a structured, organized version of the webpage’s code.
Nowadays the DOM is often very different from the initial HTML document, due to what’s collectively called dynamic HTML. Dynamic HTML is the ability for a page to change its content depending on user input, environmental conditions (e.g. time of day), and other variables, leveraging HTML, CSS, and JavaScript.
Simple example with a <title> tag that is populated through JavaScript:
HTML source
DOM
What is headless browsing?
Headless browsing is simply the action of fetching webpages without the user interface. It is important to understand because Google, and now Baidu, leverage headless browsing to gain a better understanding of the user’s experience and the content of webpages.
PhantomJS and Zombie.js are scripted headless browsers, typically used for automating web interaction for testing purposes, and rendering static HTML snapshots for initial requests (pre-rendering).
Why can JavaScript be challenging for SEO? (and how to fix issues)
There are three (3) primary reasons to be concerned about JavaScript on your site:
Crawlability: Bots’ ability to crawl your site.
Obtainability: Bots’ ability to access information and parse your content.
Perceived site latency: AKA the Critical Rendering Path.
Crawlability
Are bots able to find URLs and understand your site’s architecture? There are two important elements here:
Blocking search engines from your JavaScript (even accidentally).
Proper internal linking, not leveraging JavaScript events as a replacement for HTML tags.
Why is blocking JavaScript such a big deal?
If search engines are blocked from crawling JavaScript, they will not be receiving your site’s full experience. This means search engines are not seeing what the end user is seeing. This can reduce your site’s appeal to search engines and could eventually be considered cloaking (if the intent is indeed malicious).
Fetch as Google and TechnicalSEO.com’s robots.txt and Fetch and Render testing tools can help to identify resources that Googlebot is blocked from.
The easiest way to solve this problem is through providing search engines access to the resources they need to understand your user experience.
!!! Important note: Work with your development team to determine which files should and should not be accessible to search engines.
Internal linking
Internal linking should be implemented with regular anchor tags within the HTML or the DOM (using an HTML tag) versus leveraging JavaScript functions to allow the user to traverse the site.
Essentially: Don’t use JavaScript’s onclick events as a replacement for internal linking. While end URLs might be found and crawled (through strings in JavaScript code or XML sitemaps), they won’t be associated with the global navigation of the site.
Internal linking is a strong signal to search engines regarding the site’s architecture and importance of pages. In fact, internal links are so strong that they can (in certain situations) override “SEO hints” such as canonical tags.
URL structure
Historically, JavaScript-based websites (aka “AJAX sites”) were using fragment identifiers (#) within URLs.
Not recommended:
The Lone Hash (#) – The lone pound symbol is not crawlable. It is used to identify anchor link (aka jump links). These are the links that allow one to jump to a piece of content on a page. Anything after the lone hash portion of the URL is never sent to the server and will cause the page to automatically scroll to the first element with a matching ID (or the first <a> element with a name of the following information). Google recommends avoiding the use of "#" in URLs.
Hashbang (#!) (and escaped_fragments URLs) – Hashbang URLs were a hack to support crawlers (Google wants to avoid now and only Bing supports). Many a moon ago, Google and Bing developed a complicated AJAX solution, whereby a pretty (#!) URL with the UX co-existed with an equivalent escaped_fragment HTML-based experience for bots. Google has since backtracked on this recommendation, preferring to receive the exact user experience. In escaped fragments, there are two experiences here:
Original Experience (aka Pretty URL): This URL must either have a #! (hashbang) within the URL to indicate that there is an escaped fragment or a meta element indicating that an escaped fragment exists (<meta name="fragment" content="!">).
Escaped Fragment (aka Ugly URL, HTML snapshot): This URL replace the hashbang (#!) with “_escaped_fragment_” and serves the HTML snapshot. It is called the ugly URL because it’s long and looks like (and for all intents and purposes is) a hack.
Recommended:
pushState History API – PushState is navigation-based and part of the History API (think: your web browsing history). Essentially, pushState updates the URL in the address bar while only what needs to change on the page is updated. It allows JS sites to leverage “clean” URLs. Google currently supports pushState when it is used to support browser navigation for client-side or hybrid rendering (a brief sketch follows below).
A good use of pushState is for infinite scroll (i.e., as the user hits new parts of the page the URL will update). Ideally, if the user refreshes the page, the experience will land them in the exact same spot. However, they do not need to refresh the page, as the content updates as they scroll down, while the URL is updated in the address bar.
Example: A good example of a search engine-friendly infinite scroll implementation, created by Google’s John Mueller (go figure), can be found here. He technically leverages replaceState(), which doesn’t offer the same back-button functionality as pushState.
Read more: Mozilla PushState History API Documents
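A minimal sketch of pushState for an infinite scroll or load-more pattern (the endpoint, container ID, and URL pattern are hypothetical assumptions):

// Fetch the next chunk of content, append it, and update the address bar with a clean URL
async function loadNextPage(pageNumber) {
  const response = await fetch('/products?page=' + pageNumber); // hypothetical endpoint
  const html = await response.text();
  document.getElementById('results').insertAdjacentHTML('beforeend', html); // hypothetical container
  history.pushState({ page: pageNumber }, '', '/products/page/' + pageNumber); // clean, crawlable URL
}

// Handle the back/forward buttons by re-rendering the stored state
window.addEventListener('popstate', function (event) {
  if (event.state && event.state.page) {
    // re-render or scroll to the content for event.state.page here
  }
});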
Obtainability
Search engines have been shown to employ headless browsing to render the DOM and gain a better understanding of the user’s experience and the content on the page. That is to say, Google can process some JavaScript and uses the DOM (instead of just the initial HTML document).
At the same time, there are situations where search engines struggle to comprehend JavaScript. Nobody wants a Hulu situation to happen to their site or a client’s site. It is crucial to understand how bots are interacting with your onsite content. When you aren’t sure, test.
Assuming we’re talking about a search engine bot that executes JavaScript, there are a few important elements for search engines to be able to obtain content (two brief examples follow this list):
If the user must interact for something to fire, search engines probably aren’t seeing it.
Google is a lazy user. It doesn’t click, it doesn’t scroll, and it doesn’t log in. If the full UX demands action from the user, special precautions should be taken to ensure that bots are receiving an equivalent experience.
If the JavaScript executes after the load event fires plus ~5 seconds*, search engines may not be seeing it.
*John Mueller mentioned that there is no specific timeout value; however, sites should aim to load within five seconds.
*Screaming Frog tests show a correlation to five seconds to render content.
*The load event plus five seconds is what Google’s PageSpeed Insights, Mobile Friendliness Tool, and Fetch as Google use; check out Max Prin’s test timer.
If there are errors within the JavaScript, both browsers and search engines may be unable to execute the rest of the code and can miss sections of the page.
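Two hedged examples of the first two points (the element IDs and timing are hypothetical): content gated behind a click, and content injected well after the load event:

// 1. Click-gated content: Googlebot doesn't click, so this markup is never injected for the bot
document.getElementById('show-reviews').addEventListener('click', function () {
  document.getElementById('reviews').innerHTML = '<p>Great product, five stars!</p>';
});

// 2. Content injected ~10 seconds after the load event: likely past the rendering cutoff
window.addEventListener('load', function () {
  setTimeout(function () {
    document.getElementById('late-content').textContent = 'This text may never be indexed.';
  }, 10000);
});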
How to make sure Google and other search engines can get your content
1. TEST
The most popular approach to resolving JavaScript issues is probably not resolving anything at all (grab a coffee and let Google work its algorithmic brilliance). Providing Google with the same experience as searchers is Google’s preferred scenario.
Google first announced being able to “better understand the web (i.e., JavaScript)” in May 2014. Industry experts suggested that Google could crawl JavaScript well before this announcement. The iPullRank team offered two great pieces on this in 2011: Googlebot is Chrome and How smart are Googlebots? (thank you, Josh and Mike). Adam Audette’s 2015 study, Google can crawl JavaScript and leverages the DOM, confirmed it. Therefore, if you can see your content in the DOM, chances are your content is being parsed by Google.
Recently, Bartosz Goralewicz performed a cool experiment testing a combination of various JavaScript libraries and frameworks to determine how Google interacts with the pages (e.g., are they indexing URLs/content? How does GSC interact? etc.). It ultimately showed that Google is able to interact with many forms of JavaScript and highlighted certain frameworks as potentially more challenging. John Mueller even started a JavaScript search group (from what I’ve read, it’s fairly therapeutic).
All of these studies are amazing and help SEOs understand when to be concerned and take a proactive role. However, before you decide that sitting back is the right solution for your site, I recommend being actively cautious by experimenting with small sections. Think: Jim Collins’ “bullets, then cannonballs” philosophy from his book Great by Choice:
“A bullet is an empirical test aimed at learning what works and meets three criteria: a bullet must be low-cost, low-risk, and low-distraction… 10Xers use bullets to empirically validate what will actually work. Based on that empirical validation, they then concentrate their resources to fire a cannonball, enabling large returns from concentrated bets.”
Consider testing and reviewing through the following (a browser-console sketch follows this list):
Confirm that your content is appearing within the DOM.
Test a subset of pages to see if Google can index content.
Manually check quotes from your content.
Fetch with Google and see if content appears.
Fetch with Google supposedly occurs around the load event or before timeout. It's a great way to check whether Google will be able to see your content and whether or not you’re blocking JavaScript in your robots.txt. Although Fetch with Google is not foolproof, it’s a good starting point.
Note: If you aren’t verified in GSC, try Technicalseo.com’s Fetch and Render As Any Bot Tool.
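As a quick sanity check for the first item in the checklist above, this browser-console sketch (the phrase is a placeholder) compares the raw HTML response with the rendered DOM:

// Paste into the browser console on the page you're testing
const phrase = 'a unique sentence from your page copy'; // placeholder: use real text from your content
fetch(location.href)
  .then(function (response) { return response.text(); })
  .then(function (rawHtml) {
    console.log('Found in raw HTML (pre-JavaScript):', rawHtml.includes(phrase));
    console.log('Found in rendered DOM:', document.body.innerText.includes(phrase));
  });

If the phrase appears only in the rendered DOM, that content depends entirely on JavaScript execution and deserves extra testing with Fetch as Google.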
After you’ve tested all this, what if something's not working and search engines and bots are struggling to index and obtain your content? Perhaps you’re concerned about alternative search engines (DuckDuckGo, Facebook, LinkedIn, etc.), or maybe you’re leveraging meta information that needs to be parsed by other bots, such as Twitter summary cards or Facebook Open Graph tags. If any of this is identified in testing or presents itself as a concern, an HTML snapshot may be the only option.
2. HTML SNAPSHOTS
What are HTML snapshots?
An HTML snapshot is a fully rendered page (as one might see in the DOM) that can be returned to search engine bots (think: a static HTML version of the DOM).
Google introduced HTML snapshots in 2009, deprecated (but still supported) them in 2015, and awkwardly mentioned them as an element to “avoid” in late 2016. HTML snapshots are a contentious topic with Google. However, they're important to understand, because in certain situations they're necessary.
If search engines (or sites like Facebook) cannot grasp your JavaScript, it’s better to return an HTML snapshot than not to have your content indexed and understood at all. Ideally, your site would leverage some form of user-agent detection on the server side and return the HTML snapshot to the bot.
At the same time, one must recognize that Google wants the same experience as the user (i.e., only provide Google with an HTML snapshot if the tests are dire and the JavaScript search group cannot provide support for your situation).
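A minimal Node.js sketch of that server-side idea (the bot pattern and in-memory snapshot are simplified assumptions, not a production setup):

const http = require('http');

// Simplified assumptions: a static bot pattern and a pre-rendered snapshot keyed by URL
const BOT_PATTERN = /googlebot|bingbot|facebookexternalhit|twitterbot/i;
const snapshots = { '/': '<html><body><h1>Fully rendered content</h1></body></html>' };

http.createServer(function (req, res) {
  const userAgent = req.headers['user-agent'] || '';
  const isBot = BOT_PATTERN.test(userAgent);
  res.writeHead(200, { 'Content-Type': 'text/html' });
  if (isBot && snapshots[req.url]) {
    // Bots get the HTML snapshot; it must match what users see, or it becomes a cloaking risk
    res.end(snapshots[req.url]);
  } else {
    // Everyone else gets the normal JavaScript-driven experience
    res.end('<html><body><div id="app"></div><script src="/app.js"></script></body></html>');
  }
}).listen(3000);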
Considerations
When weighing HTML snapshots, keep in mind that Google has deprecated this AJAX crawling recommendation. Although Google technically still supports it, Google recommends avoiding it. Yes, Google changed its mind and now wants to receive the same experience as the user. This direction makes sense, as it allows the bot to receive an experience that is truer to the user experience.
A second consideration relates to the risk of cloaking. If the HTML snapshots are found not to represent the experience on the page, it’s considered a cloaking risk. Straight from the source:
“The HTML snapshot must contain the same content as the end user would see in a browser. If this is not the case, it may be considered cloaking.” – Google Developer AJAX Crawling FAQs
Benefits
Despite the considerations, HTML snapshots have powerful advantages:
Knowledge that search engines and crawlers will be able to understand the experience.
Certain types of JavaScript may be harder for Google to grasp (cough... Angular 2, the successor to AngularJS ...cough).
Other search engines and crawlers (think: Bing, Facebook) will be able to understand the experience.
Bing, among other search engines, has not stated that it can crawl and index JavaScript. HTML snapshots may be the only solution for a JavaScript-heavy site. As always, test to make sure that this is the case before diving in.
Site latency
When browsers receive an HTML document and create the DOM (although there is some level of pre-scanning), most resources are loaded as they appear within the HTML document. This means that if you have a huge file toward the top of your HTML document, a browser will load that immense file first.
The concept of Google’s critical rendering path is to load what the user needs as soon as possible, which can be translated to → "get everything above-the-fold in front of the user, ASAP."
Critical Rendering Path - Optimized Rendering Loads Progressively ASAP:
However, if you have unnecessary resources or JavaScript files clogging up the page’s ability to load, you get “render-blocking JavaScript.” Meaning: your JavaScript is blocking the page’s potential to appear as if it’s loading faster (also called: perceived latency).
Render-blocking JavaScript – Solutions
If you analyze your page speed results (through tools like Google's PageSpeed Insights, WebPageTest.org, Catchpoint, etc.) and determine that there is a render-blocking JavaScript issue, here are three potential solutions (see the markup sketch after the important note below):
Inline: Add critical JavaScript directly in the HTML document.
Async: Make the JavaScript asynchronous (i.e., add the “async” attribute to the script tag).
Defer: Defer the JavaScript by placing it lower within the HTML (or adding the “defer” attribute).
!!! Important note: It's important to understand that scripts must be arranged in order of precedence. Scripts that are used to load the above-the-fold content must be prioritized and should not be deferred. Also, any script that references another file can only be used after the referenced file has loaded. Make sure to work closely with your development team to confirm that there are no interruptions to the user’s experience.
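A brief markup sketch of the three options (the file names are hypothetical); note that async scripts execute as soon as they finish downloading, while defer scripts wait until the HTML is parsed and run in document order:

<!-- Inline: small, critical scripts can live directly in the HTML -->
<script>
  document.documentElement.className = 'js';
</script>

<!-- Async: downloads in parallel, executes as soon as it's ready (order not guaranteed) -->
<script async src="/js/analytics.js"></script>

<!-- Defer: downloads in parallel, executes after HTML parsing, in order -->
<script defer src="/js/widgets.js"></script>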
Read more: Google Developer’s Speed Documentation
TL;DR - Moral of the story
Crawlers and search engines will do their best to crawl, execute, and interpret your JavaScript, but it is not guaranteed. Make sure your content is crawlable, obtainable, and isn’t creating site latency obstructions. The key = every situation demands testing. Based on the results, evaluate potential solutions.
http://ift.tt/2sKkiUf
0 notes
Text
JavaScript &amp; SEO: Making Your Bot Experience As Good As Your User Experience
Posted by alexis-sanders
Understanding JavaScript and its potential impact on search performance is a core skillset of the modern SEO professional. If search engines can’t crawl a site or can’t parse and understand the content, nothing is going to get indexed and the site is not going to rank.
The most important questions for an SEO relating to JavaScript: Can search engines see the content and grasp the website experience? If not, what solutions can be leveraged to fix this?
FundamentalsWhat is JavaScript?
When creating a modern web page, there are three major components:
HTML – Hypertext Markup Language serves as the backbone, or organizer of content, on a site. It is the structure of the website (e.g. headings, paragraphs, list elements, etc.) and defining static content.
CSS – Cascading Style Sheets are the design, glitz, glam, and style added to a website. It makes up the presentation layer of the page.
JavaScript – JavaScript is the interactivity and a core component of the dynamic web.
Learn more about webpage development and how to code basic JavaScript.
Image sources: 1, 2, 3
JavaScript is either placed in the HTML document within <script> tags (i.e., it is embedded in the HTML) or linked/referenced. There are currently a plethora of JavaScript libraries and frameworks, including jQuery, AngularJS, ReactJS, EmberJS, etc.
JavaScript libraries and frameworks:
What is AJAX?
AJAX, or Asynchronous JavaScript and XML, is a set of web development techniques combining JavaScript and XML that allows web applications to communicate with a server in the background without interfering with the current page. Asynchronous means that other functions or lines of code can run while the async script is running. XML used to be the primary language to pass data; however, the term AJAX is used for all types of data transfers (including JSON; I guess "AJAJ" doesn’t sound as clean as "AJAX" [pun intended]).
A common use of AJAX is to update the content or layout of a webpage without initiating a full page refresh. Normally, when a page loads, all the assets on the page must be requested and fetched from the server and then rendered on the page. However, with AJAX, only the assets that differ between pages need to be loaded, which improves the user experience as they do not have to refresh the entire page.
One can think of AJAX as mini server calls. A good example of AJAX in action is Google Maps. The page updates without a full page reload (i.e., mini server calls are being used to load content as the user navigates).
Image source
What is the Document Object Model (DOM)?
As an SEO professional, you need to understand what the DOM is, because it’s what Google is using to analyze and understand webpages.
The DOM is what you see when you “Inspect Element” in a browser. Simply put, you can think of the DOM as the steps the browser takes after receiving the HTML document to render the page.
The first thing the browser receives is the HTML document. After that, it will start parsing the content within this document and fetch additional resources, such as images, CSS, and JavaScript files.
The DOM is what forms from this parsing of information and resources. One can think of it as a structured, organized version of the webpage’s code.
Nowadays the DOM is often very different from the initial HTML document, due to what’s collectively called dynamic HTML. Dynamic HTML is the ability for a page to change its content depending on user input, environmental conditions (e.g. time of day), and other variables, leveraging HTML, CSS, and JavaScript.
Simple example with a <title> tag that is populated through JavaScript:
HTML source
DOM
What is headless browsing?
Headless browsing is simply the action of fetching webpages without the user interface. It is important to understand because Google, and now Baidu, leverage headless browsing to gain a better understanding of the user’s experience and the content of webpages.
PhantomJS and Zombie.js are scripted headless browsers, typically used for automating web interaction for testing purposes, and rendering static HTML snapshots for initial requests (pre-rendering).
Why can JavaScript be challenging for SEO? (and how to fix issues)
There are three (3) primary reasons to be concerned about JavaScript on your site:
Crawlability: Bots’ ability to crawl your site.
Obtainability: Bots’ ability to access information and parse your content.
Perceived site latency: AKA the Critical Rendering Path.
Crawlability
Are bots able to find URLs and understand your site’s architecture? There are two important elements here:
Blocking search engines from your JavaScript (even accidentally).
Proper internal linking, not leveraging JavaScript events as a replacement for HTML tags.
Why is blocking JavaScript such a big deal?
If search engines are blocked from crawling JavaScript, they will not be receiving your site’s full experience. This means search engines are not seeing what the end user is seeing. This can reduce your site’s appeal to search engines and could eventually be considered cloaking (if the intent is indeed malicious).
Fetch as Google and TechnicalSEO.com’s robots.txt and Fetch and Render testing tools can help to identify resources that Googlebot is blocked from.
The easiest way to solve this problem is through providing search engines access to the resources they need to understand your user experience.
!!! Important note: Work with your development team to determine which files should and should not be accessible to search engines.
Internal linking
Internal linking should be implemented with regular anchor tags within the HTML or the DOM (using an HTML tag) versus leveraging JavaScript functions to allow the user to traverse the site.
Essentially: Don’t use JavaScript’s onclick events as a replacement for internal linking. While end URLs might be found and crawled (through strings in JavaScript code or XML sitemaps), they won’t be associated with the global navigation of the site.
Internal linking is a strong signal to search engines regarding the site’s architecture and importance of pages. In fact, internal links are so strong that they can (in certain situations) override “SEO hints” such as canonical tags.
URL structure
Historically, JavaScript-based websites (aka “AJAX sites”) were using fragment identifiers (#) within URLs.
Not recommended:
The Lone Hash (#) – The lone pound symbol is not crawlable. It is used to identify anchor link (aka jump links). These are the links that allow one to jump to a piece of content on a page. Anything after the lone hash portion of the URL is never sent to the server and will cause the page to automatically scroll to the first element with a matching ID (or the first <a> element with a name of the following information). Google recommends avoiding the use of "#" in URLs.
Hashbang (#!) (and escaped_fragments URLs) – Hashbang URLs were a hack to support crawlers (Google wants to avoid now and only Bing supports). Many a moon ago, Google and Bing developed a complicated AJAX solution, whereby a pretty (#!) URL with the UX co-existed with an equivalent escaped_fragment HTML-based experience for bots. Google has since backtracked on this recommendation, preferring to receive the exact user experience. In escaped fragments, there are two experiences here:
Original Experience (aka Pretty URL): This URL must either have a #! (hashbang) within the URL to indicate that there is an escaped fragment or a meta element indicating that an escaped fragment exists (<meta name="fragment" content="!">).
Escaped Fragment (aka Ugly URL, HTML snapshot): This URL replace the hashbang (#!) with “_escaped_fragment_” and serves the HTML snapshot. It is called the ugly URL because it’s long and looks like (and for all intents and purposes is) a hack.
Image source
Recommended:
pushState History API – PushState is navigation-based and part of the History API (think: your web browsing history). Essentially, pushState updates the URL in the address bar and only what needs to change on the page is updated. It allows JS sites to leverage “clean” URLs. PushState is currently supported by Google, when supporting browser navigation for client-side or hybrid rendering.
A good use of pushState is for infinite scroll (i.e., as the user hits new parts of the page the URL will update). Ideally, if the user refreshes the page, the experience will land them in the exact same spot. However, they do not need to refresh the page, as the content updates as they scroll down, while the URL is updated in the address bar.
Example: A good example of a search engine-friendly infinite scroll implementation, created by Google’s John Mueller (go figure), can be found here. He technically leverages the replaceState(), which doesn’t include the same back button functionality as pushState.
Read more: Mozilla PushState History API Documents
Obtainability
Search engines have been shown to employ headless browsing to render the DOM to gain a better understanding of the user’s experience and the content on page. That is to say, Google can process some JavaScript and uses the DOM (instead of the HTML document).
At the same time, there are situations where search engines struggle to comprehend JavaScript. Nobody wants a Hulu situation to happen to their site or a client’s site. It is crucial to understand how bots are interacting with your onsite content. When you aren’t sure, test.
Assuming we’re talking about a search engine bot that executes JavaScript, there are a few important elements for search engines to be able to obtain content:
If the user must interact for something to fire, search engines probably aren’t seeing it.
Google is a lazy user. It doesn’t click, it doesn’t scroll, and it doesn’t log in. If the full UX demands action from the user, special precautions should be taken to ensure that bots are receiving an equivalent experience.
If the JavaScript occurs after the JavaScript load event fires plus ~5-seconds*, search engines may not be seeing it.
*John Mueller mentioned that there is no specific timeout value; however, sites should aim to load within five seconds.
*Screaming Frog tests show a correlation to five seconds to render content.
*The load event plus five seconds is what Google’s PageSpeed Insights, Mobile Friendliness Tool, and Fetch as Google use; check out Max Prin’s test timer.
If there are errors within the JavaScript, both browsers and search engines won’t be able to go through and potentially miss sections of pages if the entire code is not executed.
How to make sure Google and other search engines can get your content1. TEST
The most popular solution to resolving JavaScript is probably not resolving anything (grab a coffee and let Google work its algorithmic brilliance). Providing Google with the same experience as searchers is Google’s preferred scenario.
Google first announced being able to “better understand the web (i.e., JavaScript)” in May 2014. Industry experts suggested that Google could crawl JavaScript way before this announcement. The iPullRank team offered two great pieces on this in 2011: Googlebot is Chrome and How smart are Googlebots? (thank you, Josh and Mike). Adam Audette’s Google can crawl JavaScript and leverages the DOM in 2015 confirmed. Therefore, if you can see your content in the DOM, chances are your content is being parsed by Google.
Recently, Barry Goralewicz performed a cool experiment testing a combination of various JavaScript libraries and frameworks to determine how Google interacts with the pages (e.g., are they indexing URL/content? How does GSC interact? Etc.). It ultimately showed that Google is able to interact with many forms of JavaScript and highlighted certain frameworks as perhaps more challenging. John Mueller even started a JavaScript search group (from what I’ve read, it’s fairly therapeutic).
All of these studies are amazing and help SEOs understand when to be concerned and take a proactive role. However, before you determine that sitting back is the right solution for your site, I recommend being actively cautious by experimenting with small section Think: Jim Collin’s “bullets, then cannonballs” philosophy from his book Great by Choice:
“A bullet is an empirical test aimed at learning what works and meets three criteria: a bullet must be low-cost, low-risk, and low-distraction… 10Xers use bullets to empirically validate what will actually work. Based on that empirical validation, they then concentrate their resources to fire a cannonball, enabling large returns from concentrated bets.”
Consider testing and reviewing through the following:
Confirm that your content is appearing within the DOM.
Test a subset of pages to see if Google can index content.
Manually check quotes from your content.
Fetch with Google and see if content appears.
Fetch with Google supposedly occurs around the load event or before timeout. It's a great test to check to see if Google will be able to see your content and whether or not you’re blocking JavaScript in your robots.txt. Although Fetch with Google is not foolproof, it’s a good starting point.
Note: If you aren’t verified in GSC, try Technicalseo.com’s Fetch and Render As Any Bot Tool.
After you’ve tested all this, what if something's not working and search engines and bots are struggling to index and obtain your content? Perhaps you’re concerned about alternative search engines (DuckDuckGo, Facebook, LinkedIn, etc.), or maybe you’re leveraging meta information that needs to be parsed by other bots, such as Twitter summary cards or Facebook Open Graph tags. If any of this is identified in testing or presents itself as a concern, an HTML snapshot may be the only decision.
2. HTML SNAPSHOTSWhat are HTmL snapshots?
HTML snapshots are a fully rendered page (as one might see in the DOM) that can be returned to search engine bots (think: a static HTML version of the DOM).
Google introduced HTML snapshots 2009, deprecated (but still supported) them in 2015, and awkwardly mentioned them as an element to “avoid” in late 2016. HTML snapshots are a contentious topic with Google. However, they're important to understand, because in certain situations they're necessary.
If search engines (or sites like Facebook) cannot grasp your JavaScript, it’s better to return an HTML snapshot than not to have your content indexed and understood at all. Ideally, your site would leverage some form of user-agent detection on the server side and return the HTML snapshot to the bot.
At the same time, one must recognize that Google wants the same experience as the user (i.e., only provide Google with an HTML snapshot if the tests are dire and the JavaScript search group cannot provide support for your situation).
Considerations
When considering HTML snapshots, you must consider that Google has deprecated this AJAX recommendation. Although Google technically still supports it, Google recommends avoiding it. Yes, Google changed its mind and now want to receive the same experience as the user. This direction makes sense, as it allows the bot to receive an experience more true to the user experience.
A second consideration factor relates to the risk of cloaking. If the HTML snapshots are found to not represent the experience on the page, it’s considered a cloaking risk. Straight from the source:
“The HTML snapshot must contain the same content as the end user would see in a browser. If this is not the case, it may be considered cloaking.” – Google Developer AJAX Crawling FAQs
Benefits
Despite the considerations, HTML snapshots have powerful advantages:
Knowledge that search engines and crawlers will be able to understand the experience.
Certain types of JavaScript may be harder for Google to grasp (cough... Angular (also colloquially referred to as AngularJS 2) …cough).
Other search engines and crawlers (think: Bing, Facebook) will be able to understand the experience.
Bing, among other search engines, has not stated that it can crawl and index JavaScript. HTML snapshots may be the only solution for a JavaScript-heavy site. As always, test to make sure that this is the case before diving in.
Site latency
When browsers receive an HTML document and create the DOM (although there is some level of pre-scanning), most resources are loaded as they appear within the HTML document. This means that if you have a huge file toward the top of your HTML document, a browser will load that immense file first.
The concept of Google’s critical rendering path is to load what the user needs as soon as possible, which can be translated to → "get everything above-the-fold in front of the user, ASAP."
Critical Rendering Path - Optimized Rendering Loads Progressively ASAP:
Image source
However, if you have unnecessary resources or JavaScript files clogging up the page’s ability to load, you get “render-blocking JavaScript.” Meaning: your JavaScript is blocking the page’s potential to appear as if it’s loading faster (also called: perceived latency).
Render-blocking JavaScript – Solutions
If you analyze your page speed results (through tools like Page Speed Insights Tool, WebPageTest.org, CatchPoint, etc.) and determine that there is a render-blocking JavaScript issue, here are three potential solutions:
Inline: Add the JavaScript in the HTML document.
Async: Make JavaScript asynchronous (i.e., add “async” attribute to HTML tag).
Defer: By placing JavaScript lower within the HTML.
!!! Important note: It's important to understand that scripts must be arranged in order of precedence. Scripts that are used to load the above-the-fold content must be prioritized and should not be deferred. Also, any script that references another file can only be used after the referenced file has loaded. Make sure to work closely with your development team to confirm that there are no interruptions to the user’s experience.
Read more: Google Developer’s Speed Documentation
TL;DR - Moral of the story
Crawlers and search engines will do their best to crawl, execute, and interpret your JavaScript, but it is not guaranteed. Make sure your content is crawlable, obtainable, and isn’t developing site latency obstructions. The key = every situation demands testing. Based on the results, evaluate potential solutions.
http://ift.tt/2sKkiUf
0 notes
Text
JavaScript &amp; SEO: Making Your Bot Experience As Good As Your User Experience
Posted by alexis-sanders
Understanding JavaScript and its potential impact on search performance is a core skillset of the modern SEO professional. If search engines can’t crawl a site or can’t parse and understand the content, nothing is going to get indexed and the site is not going to rank.
The most important questions for an SEO relating to JavaScript: Can search engines see the content and grasp the website experience? If not, what solutions can be leveraged to fix this?
FundamentalsWhat is JavaScript?
When creating a modern web page, there are three major components:
HTML – Hypertext Markup Language serves as the backbone, or organizer of content, on a site. It is the structure of the website (e.g. headings, paragraphs, list elements, etc.) and defining static content.
CSS – Cascading Style Sheets are the design, glitz, glam, and style added to a website. It makes up the presentation layer of the page.
JavaScript – JavaScript is the interactivity and a core component of the dynamic web.
Learn more about webpage development and how to code basic JavaScript.
Image sources: 1, 2, 3
JavaScript is either placed in the HTML document within <script> tags (i.e., it is embedded in the HTML) or linked/referenced. There are currently a plethora of JavaScript libraries and frameworks, including jQuery, AngularJS, ReactJS, EmberJS, etc.
JavaScript libraries and frameworks:
What is AJAX?
AJAX, or Asynchronous JavaScript and XML, is a set of web development techniques combining JavaScript and XML that allows web applications to communicate with a server in the background without interfering with the current page. Asynchronous means that other functions or lines of code can run while the async script is running. XML used to be the primary language to pass data; however, the term AJAX is used for all types of data transfers (including JSON; I guess "AJAJ" doesn’t sound as clean as "AJAX" [pun intended]).
A common use of AJAX is to update the content or layout of a webpage without initiating a full page refresh. Normally, when a page loads, all the assets on the page must be requested and fetched from the server and then rendered on the page. However, with AJAX, only the assets that differ between pages need to be loaded, which improves the user experience as they do not have to refresh the entire page.
One can think of AJAX as mini server calls. A good example of AJAX in action is Google Maps. The page updates without a full page reload (i.e., mini server calls are being used to load content as the user navigates).
Image source
What is the Document Object Model (DOM)?
As an SEO professional, you need to understand what the DOM is, because it’s what Google is using to analyze and understand webpages.
The DOM is what you see when you “Inspect Element” in a browser. Simply put, you can think of the DOM as the steps the browser takes after receiving the HTML document to render the page.
The first thing the browser receives is the HTML document. After that, it will start parsing the content within this document and fetch additional resources, such as images, CSS, and JavaScript files.
The DOM is what forms from this parsing of information and resources. One can think of it as a structured, organized version of the webpage’s code.
Nowadays the DOM is often very different from the initial HTML document, due to what’s collectively called dynamic HTML. Dynamic HTML is the ability for a page to change its content depending on user input, environmental conditions (e.g. time of day), and other variables, leveraging HTML, CSS, and JavaScript.
Simple example with a <title> tag that is populated through JavaScript:
HTML source
DOM
What is headless browsing?
Headless browsing is simply the action of fetching webpages without the user interface. It is important to understand because Google, and now Baidu, leverage headless browsing to gain a better understanding of the user’s experience and the content of webpages.
PhantomJS and Zombie.js are scripted headless browsers, typically used for automating web interaction for testing purposes, and rendering static HTML snapshots for initial requests (pre-rendering).
Why can JavaScript be challenging for SEO? (and how to fix issues)
There are three (3) primary reasons to be concerned about JavaScript on your site:
Crawlability: Bots’ ability to crawl your site.
Obtainability: Bots’ ability to access information and parse your content.
Perceived site latency: AKA the Critical Rendering Path.
Crawlability
Are bots able to find URLs and understand your site’s architecture? There are two important elements here:
Blocking search engines from your JavaScript (even accidentally).
Proper internal linking, not leveraging JavaScript events as a replacement for HTML tags.
Why is blocking JavaScript such a big deal?
If search engines are blocked from crawling JavaScript, they will not be receiving your site’s full experience. This means search engines are not seeing what the end user is seeing. This can reduce your site’s appeal to search engines and could eventually be considered cloaking (if the intent is indeed malicious).
Fetch as Google and TechnicalSEO.com’s robots.txt and Fetch and Render testing tools can help to identify resources that Googlebot is blocked from.
The easiest way to solve this problem is through providing search engines access to the resources they need to understand your user experience.
!!! Important note: Work with your development team to determine which files should and should not be accessible to search engines.
Internal linking
Internal linking should be implemented with regular anchor tags within the HTML or the DOM (using an HTML tag) versus leveraging JavaScript functions to allow the user to traverse the site.
Essentially: Don’t use JavaScript’s onclick events as a replacement for internal linking. While end URLs might be found and crawled (through strings in JavaScript code or XML sitemaps), they won’t be associated with the global navigation of the site.
Internal linking is a strong signal to search engines regarding the site’s architecture and importance of pages. In fact, internal links are so strong that they can (in certain situations) override “SEO hints” such as canonical tags.
URL structure
Historically, JavaScript-based websites (aka “AJAX sites”) were using fragment identifiers (#) within URLs.
Not recommended:
The Lone Hash (#) – The lone pound symbol is not crawlable. It is used to identify anchor link (aka jump links). These are the links that allow one to jump to a piece of content on a page. Anything after the lone hash portion of the URL is never sent to the server and will cause the page to automatically scroll to the first element with a matching ID (or the first <a> element with a name of the following information). Google recommends avoiding the use of "#" in URLs.
Hashbang (#!) (and escaped_fragments URLs) – Hashbang URLs were a hack to support crawlers (Google wants to avoid now and only Bing supports). Many a moon ago, Google and Bing developed a complicated AJAX solution, whereby a pretty (#!) URL with the UX co-existed with an equivalent escaped_fragment HTML-based experience for bots. Google has since backtracked on this recommendation, preferring to receive the exact user experience. In escaped fragments, there are two experiences here:
Original Experience (aka Pretty URL): This URL must either have a #! (hashbang) within the URL to indicate that there is an escaped fragment or a meta element indicating that an escaped fragment exists (<meta name="fragment" content="!">).
Escaped Fragment (aka Ugly URL, HTML snapshot): This URL replace the hashbang (#!) with “_escaped_fragment_” and serves the HTML snapshot. It is called the ugly URL because it’s long and looks like (and for all intents and purposes is) a hack.
Image source
Recommended:
pushState History API – PushState is navigation-based and part of the History API (think: your web browsing history). Essentially, pushState updates the URL in the address bar and only what needs to change on the page is updated. It allows JS sites to leverage “clean” URLs. PushState is currently supported by Google, when supporting browser navigation for client-side or hybrid rendering.
A good use of pushState is for infinite scroll (i.e., as the user hits new parts of the page the URL will update). Ideally, if the user refreshes the page, the experience will land them in the exact same spot. However, they do not need to refresh the page, as the content updates as they scroll down, while the URL is updated in the address bar.
Example: A good example of a search engine-friendly infinite scroll implementation, created by Google’s John Mueller (go figure), can be found here. He technically leverages the replaceState(), which doesn’t include the same back button functionality as pushState.
Read more: Mozilla PushState History API Documents
Obtainability
Search engines have been shown to employ headless browsing to render the DOM to gain a better understanding of the user’s experience and the content on page. That is to say, Google can process some JavaScript and uses the DOM (instead of the HTML document).
At the same time, there are situations where search engines struggle to comprehend JavaScript. Nobody wants a Hulu situation to happen to their site or a client’s site. It is crucial to understand how bots are interacting with your onsite content. When you aren’t sure, test.
Assuming we’re talking about a search engine bot that executes JavaScript, there are a few important elements for search engines to be able to obtain content:
If the user must interact for something to fire, search engines probably aren’t seeing it.
Google is a lazy user. It doesn’t click, it doesn’t scroll, and it doesn’t log in. If the full UX demands action from the user, special precautions should be taken to ensure that bots are receiving an equivalent experience.
If the JavaScript occurs after the JavaScript load event fires plus ~5-seconds*, search engines may not be seeing it.
*John Mueller mentioned that there is no specific timeout value; however, sites should aim to load within five seconds.
*Screaming Frog tests show a correlation to five seconds to render content.
*The load event plus five seconds is what Google’s PageSpeed Insights, Mobile Friendliness Tool, and Fetch as Google use; check out Max Prin’s test timer.
If there are errors within the JavaScript, both browsers and search engines won’t be able to go through and potentially miss sections of pages if the entire code is not executed.
How to make sure Google and other search engines can get your content1. TEST
The most popular solution to resolving JavaScript is probably not resolving anything (grab a coffee and let Google work its algorithmic brilliance). Providing Google with the same experience as searchers is Google’s preferred scenario.
Google first announced being able to “better understand the web (i.e., JavaScript)” in May 2014. Industry experts suggested that Google could crawl JavaScript way before this announcement. The iPullRank team offered two great pieces on this in 2011: Googlebot is Chrome and How smart are Googlebots? (thank you, Josh and Mike). Adam Audette’s Google can crawl JavaScript and leverages the DOM in 2015 confirmed. Therefore, if you can see your content in the DOM, chances are your content is being parsed by Google.
Recently, Barry Goralewicz performed a cool experiment testing a combination of various JavaScript libraries and frameworks to determine how Google interacts with the pages (e.g., are they indexing URL/content? How does GSC interact? Etc.). It ultimately showed that Google is able to interact with many forms of JavaScript and highlighted certain frameworks as perhaps more challenging. John Mueller even started a JavaScript search group (from what I’ve read, it’s fairly therapeutic).
All of these studies are amazing and help SEOs understand when to be concerned and take a proactive role. However, before you determine that sitting back is the right solution for your site, I recommend being actively cautious by experimenting with small section Think: Jim Collin’s “bullets, then cannonballs” philosophy from his book Great by Choice:
“A bullet is an empirical test aimed at learning what works and meets three criteria: a bullet must be low-cost, low-risk, and low-distraction… 10Xers use bullets to empirically validate what will actually work. Based on that empirical validation, they then concentrate their resources to fire a cannonball, enabling large returns from concentrated bets.”
Consider testing and reviewing through the following:
Confirm that your content is appearing within the DOM.
Test a subset of pages to see if Google can index content.
Manually check quotes from your content.
Fetch with Google and see if content appears.
Fetch with Google supposedly occurs around the load event or before timeout. It's a great test to check to see if Google will be able to see your content and whether or not you’re blocking JavaScript in your robots.txt. Although Fetch with Google is not foolproof, it’s a good starting point.
Note: If you aren’t verified in GSC, try Technicalseo.com’s Fetch and Render As Any Bot Tool.
After you’ve tested all this, what if something's not working and search engines and bots are struggling to index and obtain your content? Perhaps you’re concerned about alternative search engines (DuckDuckGo, Facebook, LinkedIn, etc.), or maybe you’re leveraging meta information that needs to be parsed by other bots, such as Twitter summary cards or Facebook Open Graph tags. If any of this is identified in testing or presents itself as a concern, an HTML snapshot may be the only decision.
2. HTML SNAPSHOTSWhat are HTmL snapshots?
HTML snapshots are a fully rendered page (as one might see in the DOM) that can be returned to search engine bots (think: a static HTML version of the DOM).
Google introduced HTML snapshots 2009, deprecated (but still supported) them in 2015, and awkwardly mentioned them as an element to “avoid” in late 2016. HTML snapshots are a contentious topic with Google. However, they're important to understand, because in certain situations they're necessary.
If search engines (or sites like Facebook) cannot grasp your JavaScript, it’s better to return an HTML snapshot than not to have your content indexed and understood at all. Ideally, your site would leverage some form of user-agent detection on the server side and return the HTML snapshot to the bot.
At the same time, one must recognize that Google wants the same experience as the user (i.e., only provide Google with an HTML snapshot if the tests are dire and the JavaScript search group cannot provide support for your situation).
Considerations
When considering HTML snapshots, you must consider that Google has deprecated this AJAX recommendation. Although Google technically still supports it, Google recommends avoiding it. Yes, Google changed its mind and now want to receive the same experience as the user. This direction makes sense, as it allows the bot to receive an experience more true to the user experience.
A second consideration factor relates to the risk of cloaking. If the HTML snapshots are found to not represent the experience on the page, it’s considered a cloaking risk. Straight from the source:
“The HTML snapshot must contain the same content as the end user would see in a browser. If this is not the case, it may be considered cloaking.” – Google Developer AJAX Crawling FAQs
Benefits
Despite the considerations, HTML snapshots have powerful advantages:
Knowledge that search engines and crawlers will be able to understand the experience.
Certain types of JavaScript may be harder for Google to grasp (cough... Angular (also colloquially referred to as AngularJS 2) …cough).
Other search engines and crawlers (think: Bing, Facebook) will be able to understand the experience.
Bing, among other search engines, has not stated that it can crawl and index JavaScript. HTML snapshots may be the only solution for a JavaScript-heavy site. As always, test to make sure that this is the case before diving in.
Site latency
When browsers receive an HTML document and create the DOM (although there is some level of pre-scanning), most resources are loaded as they appear within the HTML document. This means that if you have a huge file toward the top of your HTML document, a browser will load that immense file first.
The concept of Google’s critical rendering path is to load what the user needs as soon as possible, which can be translated to → "get everything above-the-fold in front of the user, ASAP."
Critical Rendering Path - Optimized Rendering Loads Progressively ASAP:
Image source
However, if you have unnecessary resources or JavaScript files clogging up the page’s ability to load, you get “render-blocking JavaScript.” Meaning: your JavaScript is blocking the page’s potential to appear as if it’s loading faster (also called: perceived latency).
Render-blocking JavaScript – Solutions
If you analyze your page speed results (through tools like Page Speed Insights Tool, WebPageTest.org, CatchPoint, etc.) and determine that there is a render-blocking JavaScript issue, here are three potential solutions:
Inline: Add the JavaScript in the HTML document.
Async: Make JavaScript asynchronous (i.e., add “async” attribute to HTML tag).
Defer: By placing JavaScript lower within the HTML.
!!! Important note: It's important to understand that scripts must be arranged in order of precedence. Scripts that are used to load the above-the-fold content must be prioritized and should not be deferred. Also, any script that references another file can only be used after the referenced file has loaded. Make sure to work closely with your development team to confirm that there are no interruptions to the user’s experience.
Read more: Google Developer’s Speed Documentation
TL;DR - Moral of the story
Crawlers and search engines will do their best to crawl, execute, and interpret your JavaScript, but it is not guaranteed. Make sure your content is crawlable, obtainable, and isn’t developing site latency obstructions. The key = every situation demands testing. Based on the results, evaluate potential solutions.
http://ift.tt/2sKkiUf
0 notes
Text
JavaScript &amp; SEO: Making Your Bot Experience As Good As Your User Experience
Posted by alexis-sanders
Understanding JavaScript and its potential impact on search performance is a core skillset of the modern SEO professional. If search engines can’t crawl a site or can’t parse and understand the content, nothing is going to get indexed and the site is not going to rank.
The most important questions for an SEO relating to JavaScript: Can search engines see the content and grasp the website experience? If not, what solutions can be leveraged to fix this?
FundamentalsWhat is JavaScript?
When creating a modern web page, there are three major components:
HTML – Hypertext Markup Language serves as the backbone, or organizer of content, on a site. It is the structure of the website (e.g. headings, paragraphs, list elements, etc.) and defining static content.
CSS – Cascading Style Sheets are the design, glitz, glam, and style added to a website. It makes up the presentation layer of the page.
JavaScript – JavaScript is the interactivity and a core component of the dynamic web.
Learn more about webpage development and how to code basic JavaScript.
Image sources: 1, 2, 3
JavaScript is either placed in the HTML document within <script> tags (i.e., it is embedded in the HTML) or linked/referenced. There are currently a plethora of JavaScript libraries and frameworks, including jQuery, AngularJS, ReactJS, EmberJS, etc.
JavaScript libraries and frameworks:
What is AJAX?
AJAX, or Asynchronous JavaScript and XML, is a set of web development techniques combining JavaScript and XML that allows web applications to communicate with a server in the background without interfering with the current page. Asynchronous means that other functions or lines of code can run while the async script is running. XML used to be the primary language to pass data; however, the term AJAX is used for all types of data transfers (including JSON; I guess "AJAJ" doesn’t sound as clean as "AJAX" [pun intended]).
A common use of AJAX is to update the content or layout of a webpage without initiating a full page refresh. Normally, when a page loads, all the assets on the page must be requested and fetched from the server and then rendered on the page. However, with AJAX, only the assets that differ between pages need to be loaded, which improves the user experience as they do not have to refresh the entire page.
One can think of AJAX as mini server calls. A good example of AJAX in action is Google Maps. The page updates without a full page reload (i.e., mini server calls are being used to load content as the user navigates).
Image source
What is the Document Object Model (DOM)?
As an SEO professional, you need to understand what the DOM is, because it’s what Google is using to analyze and understand webpages.
The DOM is what you see when you “Inspect Element” in a browser. Simply put, you can think of the DOM as the steps the browser takes after receiving the HTML document to render the page.
The first thing the browser receives is the HTML document. After that, it will start parsing the content within this document and fetch additional resources, such as images, CSS, and JavaScript files.
The DOM is what forms from this parsing of information and resources. One can think of it as a structured, organized version of the webpage’s code.
Nowadays the DOM is often very different from the initial HTML document, due to what’s collectively called dynamic HTML. Dynamic HTML is the ability for a page to change its content depending on user input, environmental conditions (e.g. time of day), and other variables, leveraging HTML, CSS, and JavaScript.
Simple example with a <title> tag that is populated through JavaScript:
HTML source
DOM
What is headless browsing?
Headless browsing is simply the action of fetching webpages without the user interface. It is important to understand because Google, and now Baidu, leverage headless browsing to gain a better understanding of the user’s experience and the content of webpages.
PhantomJS and Zombie.js are scripted headless browsers, typically used for automating web interaction for testing purposes, and rendering static HTML snapshots for initial requests (pre-rendering).
Why can JavaScript be challenging for SEO? (and how to fix issues)
There are three (3) primary reasons to be concerned about JavaScript on your site:
Crawlability: Bots’ ability to crawl your site.
Obtainability: Bots’ ability to access information and parse your content.
Perceived site latency: AKA the Critical Rendering Path.
Crawlability
Are bots able to find URLs and understand your site’s architecture? There are two important elements here:
Blocking search engines from your JavaScript (even accidentally).
Proper internal linking, not leveraging JavaScript events as a replacement for HTML tags.
Why is blocking JavaScript such a big deal?
If search engines are blocked from crawling JavaScript, they will not be receiving your site’s full experience. This means search engines are not seeing what the end user is seeing. This can reduce your site’s appeal to search engines and could eventually be considered cloaking (if the intent is indeed malicious).
Fetch as Google and TechnicalSEO.com’s robots.txt and Fetch and Render testing tools can help to identify resources that Googlebot is blocked from.
The easiest way to solve this problem is through providing search engines access to the resources they need to understand your user experience.
!!! Important note: Work with your development team to determine which files should and should not be accessible to search engines.
Internal linking
Internal linking should be implemented with regular anchor tags within the HTML or the DOM (using an HTML tag) versus leveraging JavaScript functions to allow the user to traverse the site.
Essentially: Don’t use JavaScript’s onclick events as a replacement for internal linking. While end URLs might be found and crawled (through strings in JavaScript code or XML sitemaps), they won’t be associated with the global navigation of the site.
Internal linking is a strong signal to search engines regarding the site’s architecture and importance of pages. In fact, internal links are so strong that they can (in certain situations) override “SEO hints” such as canonical tags.
URL structure
Historically, JavaScript-based websites (aka “AJAX sites”) were using fragment identifiers (#) within URLs.
Not recommended:
The Lone Hash (#) – The lone pound symbol is not crawlable. It is used to identify anchor link (aka jump links). These are the links that allow one to jump to a piece of content on a page. Anything after the lone hash portion of the URL is never sent to the server and will cause the page to automatically scroll to the first element with a matching ID (or the first <a> element with a name of the following information). Google recommends avoiding the use of "#" in URLs.
Hashbang (#!) (and escaped_fragments URLs) – Hashbang URLs were a hack to support crawlers (Google wants to avoid now and only Bing supports). Many a moon ago, Google and Bing developed a complicated AJAX solution, whereby a pretty (#!) URL with the UX co-existed with an equivalent escaped_fragment HTML-based experience for bots. Google has since backtracked on this recommendation, preferring to receive the exact user experience. In escaped fragments, there are two experiences here:
Original Experience (aka Pretty URL): This URL must either have a #! (hashbang) within the URL to indicate that there is an escaped fragment or a meta element indicating that an escaped fragment exists (<meta name="fragment" content="!">).
Escaped Fragment (aka Ugly URL, HTML snapshot): This URL replace the hashbang (#!) with “_escaped_fragment_” and serves the HTML snapshot. It is called the ugly URL because it’s long and looks like (and for all intents and purposes is) a hack.
Image source
Recommended:
pushState History API – PushState is navigation-based and part of the History API (think: your web browsing history). Essentially, pushState updates the URL in the address bar and only what needs to change on the page is updated. It allows JS sites to leverage “clean” URLs. PushState is currently supported by Google, when supporting browser navigation for client-side or hybrid rendering.
A good use of pushState is for infinite scroll (i.e., as the user hits new parts of the page the URL will update). Ideally, if the user refreshes the page, the experience will land them in the exact same spot. However, they do not need to refresh the page, as the content updates as they scroll down, while the URL is updated in the address bar.
Example: A good example of a search engine-friendly infinite scroll implementation, created by Google’s John Mueller (go figure), can be found here. He technically leverages the replaceState(), which doesn’t include the same back button functionality as pushState.
Read more: Mozilla PushState History API Documents
Obtainability
Search engines have been shown to employ headless browsing to render the DOM to gain a better understanding of the user’s experience and the content on page. That is to say, Google can process some JavaScript and uses the DOM (instead of the HTML document).
At the same time, there are situations where search engines struggle to comprehend JavaScript. Nobody wants a Hulu situation to happen to their site or a client’s site. It is crucial to understand how bots are interacting with your onsite content. When you aren’t sure, test.
Assuming we’re talking about a search engine bot that executes JavaScript, there are a few important elements for search engines to be able to obtain content:
If the user must interact for something to fire, search engines probably aren’t seeing it.
Google is a lazy user. It doesn’t click, it doesn’t scroll, and it doesn’t log in. If the full UX demands action from the user, special precautions should be taken to ensure that bots are receiving an equivalent experience (a brief sketch of this appears after this list).
If the JavaScript executes after the load event fires plus roughly five seconds*, search engines may not be seeing it.
*John Mueller mentioned that there is no specific timeout value; however, sites should aim to load within five seconds.
*Screaming Frog tests show a correlation to five seconds to render content.
*The load event plus five seconds is what Google’s PageSpeed Insights, Mobile Friendliness Tool, and Fetch as Google use; check out Max Prin’s test timer.
If there are errors within the JavaScript, both browsers and search engines may stop executing partway through and miss sections of the page if the entire code cannot run.
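As a hedged sketch of the first point above (the endpoint paths and element IDs are made up for illustration): content fetched automatically on load has a chance of being rendered and seen, while content gated behind a click likely never will be.

```javascript
// Fetched on page load: injected into the DOM without user action, so a
// JavaScript-executing crawler has a chance to see it.
document.addEventListener('DOMContentLoaded', function () {
  fetch('/api/article-body') // hypothetical endpoint
    .then(function (response) { return response.text(); })
    .then(function (html) {
      document.getElementById('article').innerHTML = html;
    });
});

// Fetched only on click: Googlebot doesn't click, scroll, or log in, so this
// content will likely never appear in the DOM it renders and indexes.
document.getElementById('load-reviews').addEventListener('click', function () {
  fetch('/api/reviews') // hypothetical endpoint
    .then(function (response) { return response.text(); })
    .then(function (html) {
      document.getElementById('reviews').innerHTML = html;
    });
});
```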
How to make sure Google and other search engines can get your content
1. TEST
The most popular way to deal with JavaScript is often to do nothing at all (grab a coffee and let Google work its algorithmic brilliance). Providing Google with the same experience as searchers is Google’s preferred scenario.
Google first announced being able to “better understand the web (i.e., JavaScript)” in May 2014. Industry experts suggested that Google could crawl JavaScript well before this announcement. The iPullRank team offered two great pieces on this in 2011: Googlebot is Chrome and How smart are Googlebots? (thank you, Josh and Mike). Adam Audette’s 2015 test confirmed that Google can crawl JavaScript and leverages the DOM. Therefore, if you can see your content in the DOM, chances are your content is being parsed by Google.
Recently, Barry Goralewicz performed a cool experiment testing a combination of various JavaScript libraries and frameworks to determine how Google interacts with the pages (e.g., are they indexing URL/content? How does GSC interact? Etc.). It ultimately showed that Google is able to interact with many forms of JavaScript and highlighted certain frameworks as perhaps more challenging. John Mueller even started a JavaScript search group (from what I’ve read, it’s fairly therapeutic).
All of these studies are amazing and help SEOs understand when to be concerned and take a proactive role. However, before you determine that sitting back is the right solution for your site, I recommend being actively cautious by experimenting with small sections. Think: Jim Collins’s “bullets, then cannonballs” philosophy from his book Great by Choice:
“A bullet is an empirical test aimed at learning what works and meets three criteria: a bullet must be low-cost, low-risk, and low-distraction… 10Xers use bullets to empirically validate what will actually work. Based on that empirical validation, they then concentrate their resources to fire a cannonball, enabling large returns from concentrated bets.”
Consider testing and reviewing through the following:
Confirm that your content is appearing within the DOM.
Test a subset of pages to see if Google can index content.
Manually check quotes from your content (a quick console check for this is sketched below).
Fetch with Google and see if content appears.
Fetch with Google supposedly occurs around the load event or before timeout. It's a great way to check whether Google will be able to see your content and whether you’re blocking JavaScript in your robots.txt. Although Fetch with Google is not foolproof, it’s a good starting point.
Note: If you aren’t verified in GSC, try Technicalseo.com’s Fetch and Render As Any Bot Tool.
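For the DOM and quote checks above, here is a quick-and-dirty verification you can run from the browser’s DevTools console. The phrase is a placeholder you would swap for a sentence your JavaScript actually injects.

```javascript
// Paste into the DevTools console on the rendered page. Checks whether a
// JS-injected phrase made it into the DOM, and whether it also exists in the
// raw HTML response (roughly what a non-rendering crawler would receive).
var phrase = 'a unique sentence from your JS-rendered content'; // placeholder

console.log(
  document.documentElement.outerHTML.indexOf(phrase) !== -1
    ? 'Found in the rendered DOM'
    : 'NOT found in the rendered DOM'
);

fetch(window.location.href)
  .then(function (response) { return response.text(); })
  .then(function (rawHtml) {
    console.log(
      rawHtml.indexOf(phrase) !== -1
        ? 'Also present in the raw HTML source'
        : 'Only present after JavaScript renders'
    );
  });
```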
After you’ve tested all this, what if something's not working and search engines and bots are struggling to index and obtain your content? Perhaps you’re concerned about alternative search engines (DuckDuckGo, Facebook, LinkedIn, etc.), or maybe you’re leveraging meta information that needs to be parsed by other bots, such as Twitter summary cards or Facebook Open Graph tags. If any of this is identified in testing or presents itself as a concern, an HTML snapshot may be the only option.
2. HTML SNAPSHOTS
What are HTML snapshots?
HTML snapshots are a fully rendered page (as one might see in the DOM) that can be returned to search engine bots (think: a static HTML version of the DOM).
Google introduced HTML snapshots in 2009, deprecated (but still supported) them in 2015, and awkwardly mentioned them as an element to “avoid” in late 2016. HTML snapshots are a contentious topic with Google. However, they're important to understand, because in certain situations they're necessary.
If search engines (or sites like Facebook) cannot grasp your JavaScript, it’s better to return an HTML snapshot than not to have your content indexed and understood at all. Ideally, your site would leverage some form of user-agent detection on the server side and return the HTML snapshot to the bot.
At the same time, one must recognize that Google wants the same experience as the user (i.e., only provide Google with an HTML snapshot if the tests are dire and the JavaScript search group cannot provide support for your situation).
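If testing does point you toward snapshots, the server-side detection might look roughly like the sketch below. It assumes Node.js with Express purely for illustration; the bot patterns, file paths, and port are placeholders, and the snapshot must mirror what users see (note the cloaking warning below).

```javascript
// Minimal sketch of server-side user-agent detection (assumes Node.js + Express).
// Bot patterns and file paths are illustrative assumptions, not a complete list.
const express = require('express');
const path = require('path');

const app = express();
const BOTS = /googlebot|bingbot|facebookexternalhit|twitterbot|linkedinbot/i;

app.get('*', (req, res) => {
  const userAgent = req.get('User-Agent') || '';

  if (BOTS.test(userAgent)) {
    // Crawlers get the pre-rendered HTML snapshot of the page.
    res.sendFile(path.join(__dirname, 'snapshots', 'index.html'));
  } else {
    // Regular users get the JavaScript application shell.
    res.sendFile(path.join(__dirname, 'public', 'index.html'));
  }
});

app.listen(3000);
```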
Considerations
When considering HTML snapshots, you must remember that Google has deprecated this AJAX crawling recommendation. Although Google technically still supports it, Google recommends avoiding it. Yes, Google changed its mind and now wants to receive the same experience as the user. This direction makes sense, as it ensures the bot receives an experience closer to what the user actually sees.
A second consideration is the risk of cloaking. If the HTML snapshots are found not to represent the experience on the page, it’s considered a cloaking risk. Straight from the source:
“The HTML snapshot must contain the same content as the end user would see in a browser. If this is not the case, it may be considered cloaking.” – Google Developer AJAX Crawling FAQs
Benefits
Despite the considerations, HTML snapshots have powerful advantages:
Knowledge that search engines and crawlers will be able to understand the experience.
Certain types of JavaScript may be harder for Google to grasp (cough... Angular (also colloquially referred to as AngularJS 2) …cough).
Other search engines and crawlers (think: Bing, Facebook) will be able to understand the experience.
Bing, among other search engines, has not stated that it can crawl and index JavaScript. HTML snapshots may be the only solution for a JavaScript-heavy site. As always, test to make sure that this is the case before diving in.
Site latency
When browsers receive an HTML document and create the DOM (although there is some level of pre-scanning), most resources are loaded as they appear within the HTML document. This means that if you have a huge file toward the top of your HTML document, a browser will load that immense file first.
The concept of Google’s critical rendering path is to load what the user needs as soon as possible, which can be translated to → "get everything above-the-fold in front of the user, ASAP."
Critical rendering path: optimized rendering loads progressively, ASAP.
However, if you have unnecessary resources or JavaScript files clogging up the page’s ability to load, you get “render-blocking JavaScript.” Meaning: your JavaScript is blocking the page from appearing to load quickly (also called perceived latency).
Render-blocking JavaScript – Solutions
If you analyze your page speed results (through tools like Google’s PageSpeed Insights, WebPageTest.org, CatchPoint, etc.) and determine that there is a render-blocking JavaScript issue, here are three potential solutions (a brief sketch follows the note below):
Inline: Add the critical JavaScript directly within the HTML document.
Async: Make the JavaScript asynchronous (i.e., add the “async” attribute to the script tag).
Defer: Defer the JavaScript by placing it lower within the HTML document.
!!! Important note: Scripts must be arranged in order of precedence. Scripts that are used to load above-the-fold content must be prioritized and should not be deferred. Also, any script that references another file can only be used after the referenced file has loaded. Make sure to work closely with your development team to confirm that there are no interruptions to the user’s experience.
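As a hedged sketch of what those three options look like in practice (the file paths are placeholders), the markup equivalents are shown as comments and the async option is also shown programmatically:

```javascript
// In the HTML document itself, the three options look roughly like:
//   Inline: <script> /* small, critical JS pasted directly into the page */ </script>
//   Async:  <script async src="/js/non-critical.js"></script>
//   Defer:  <script defer src="/js/non-critical.js"></script>
//
// The async behavior can also be achieved by injecting the script dynamically
// (dynamically injected scripts don't block parsing). Path is a placeholder.
var script = document.createElement('script');
script.src = '/js/non-critical.js';
script.async = true; // explicit, though injected scripts are async by default
document.head.appendChild(script);
```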
Read more: Google Developer’s Speed Documentation
TL;DR - Moral of the story
Crawlers and search engines will do their best to crawl, execute, and interpret your JavaScript, but it is not guaranteed. Make sure your content is crawlable, obtainable, and isn’t developing site latency obstructions. The key = every situation demands testing. Based on the results, evaluate potential solutions.
http://ift.tt/2sKkiUf
Text
Today’s the Last Day to Get a Great Deal on Your New StudioPress Site
Heads up, today is the last day to get your first month free, plus no-charge migration of your existing WordPress site to a brand-new, easy-to-use StudioPress Site.
You’ve got until 5:00 p.m. Pacific Time today, April 28, 2017 to get the deal. Simply click this link and the incentives will be applied at checkout.
I’ve included the original post below for more information if you missed it. See you on the other side!
_____________________________
It’s been less than three months since we launched StudioPress Sites, our new solution that combines the ease of an all-in-one website builder with the flexible power of WordPress.
The response and feedback have been phenomenal. And the icing on the cake is that we’re already winning accolades.
In an independent speed test performed this month by WebMatros, StudioPress Sites was declared the undisputed winner. We’re thrilled, because we were up against formidable competition from WP Engine, Flywheel, Media Temple, Pressable, and Bluehost.
As you know, speed is important. If a page takes more than a couple of seconds to load, users will instantly hit the back button and move on.
But that’s only part of the story. Because unlike those other hosts, with StudioPress Sites you just sign up and quickly set up, without the usual hassles of self-hosted WordPress.
WordPress made fast and easy
The primary difference between a website builder and self-hosted WordPress is that with the former, you’re dealing with software as a service (SaaS), while the latter is … well, hosting. Not only is self-hosted WordPress a pain to deal with, it can also lead to unexpected surprises if you actually succeed (like your site crashing).
In this sense, StudioPress Sites is more like SaaS than hosting. You can set up your new site in just minutes on our server infrastructure that’s specifically optimized (and now independently tested) for peak WordPress performance.
From there, you simply select from 20 mobile-optimized HTML5 designs. Then, you choose from a library of trusted plugins for the functionality you need — and install them with one click.
Next, you put the included SEO tools to work, like our patented content analysis and optimization software, keyword research, advanced schema control, XML sitemap generation, robots.txt generation, asynchronous JavaScript loading, enhanced Open Graph output, breadcrumb title control, and AMP support.
There’s even more to StudioPress Sites than what I’ve highlighted here, but you can check out all the features at StudioPress.com. Let’s talk about the deal.
First month free, plus free migration
It’s really that simple. When you sign up for StudioPress Sites before 5:00 p.m. Pacific Time on April 28, 2017, you pay nothing for your first month.
On top of that, we’ll move you from your current WordPress site to your brand-new, easy-to-use, and blazingly fast StudioPress Site at no charge.
Why?
Because we know that moving your website can be a pain, even if you’re not happy with your current host. And just as importantly, because we want you to try StudioPress Sites risk free.
Fair enough?
Cool — head over to StudioPress to check it all out and sign up today.
NOTE: You must use that ^^^^ special link to get the deal!
The post Today’s the Last Day to Get a Great Deal on Your New StudioPress Site appeared first on Copyblogger.
via marketing http://ift.tt/2pbrkPR