I often see requirements, likely carried over from an original Request for Proposal (RFP) into a CMS Functional Design, that read something like: "authors should have the ability to set page-level meta tags including the Robots meta tag."
This meta tag "tells" search engines whether to index the content on a given Web page and whether to follow its links.*
Valid Content values for <META NAME="ROBOTS"> include:
- INDEX
- NOINDEX
- FOLLOW
- NOFOLLOW
If the tag is absent, the default is INDEX,FOLLOW. See the Web Robots Pages for the details.
*Bots aren't forced to actually listen to these instructions, though I'd say most search engines try to play nice.
Everything's an Attribute Value (Too Simple and Error Prone)
The simplistic way to implement this might be CMS check boxes for each of the above so you'd have (where "[ ]" means checkbox):
- [ ] INDEX
- [ ] NOINDEX
- [ ] FOLLOW
- [ ] NOFOLLOW
But this lets authors select contradictory options, such as INDEX and NOINDEX together. Luckily I haven't seen this in practice, but it's a good example of where to avoid assumptions. We already know a few things:
- The Robots tags are at the page level
- Only some combinations of options are valid
- We have a good default (no tag at all or "INDEX,FOLLOW" as described above)
- No translation for the values needed (though you might translate the internal authoring fields if needed)
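To make the problem with four independent check boxes concrete, here is a minimal sketch that enumerates every possible selection and counts the contradictory ones. The names `DIRECTIVES`, `CONFLICTS`, and `is_contradictory` are made up for illustration and are not part of any CMS API.

```python
from itertools import product

# The four check boxes from the simplistic setup.
DIRECTIVES = ["INDEX", "NOINDEX", "FOLLOW", "NOFOLLOW"]
# Pairs that contradict each other when both are ticked.
CONFLICTS = [("INDEX", "NOINDEX"), ("FOLLOW", "NOFOLLOW")]


def is_contradictory(selected):
    """Return True if the selection holds a directive and its opposite."""
    return any(a in selected and b in selected for a, b in CONFLICTS)


def all_checkbox_states():
    """Yield every possible set of ticked check boxes (2**4 = 16 states)."""
    for flags in product([False, True], repeat=len(DIRECTIVES)):
        yield {name for name, ticked in zip(DIRECTIVES, flags) if ticked}


states = list(all_checkbox_states())
contradictory = [s for s in states if is_contradictory(s)]
print(len(states), len(contradictory))  # prints "16 7"
```

Seven of the sixteen states are contradictory, so a design that makes those states impossible is safer than one that relies on validation or author discipline.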
Two Boolean Options (Okay)
Since we know the choices come in mutually exclusive pairs (INDEX or NOINDEX, FOLLOW or NOFOLLOW), you can reduce them to two check boxes:
- [ ] INDEX
- [ ] FOLLOW
Minor note: templating code, or however you render these values, would need to translate an unselected INDEX into NOINDEX and, likewise, an unselected FOLLOW into NOFOLLOW.
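As a rough sketch of that translation step, assuming the two check boxes reach the rendering code as plain booleans, it could look like the following. The function names `robots_content` and `robots_meta_tag` are hypothetical and not a Tridion templating API.

```python
def robots_content(index: bool, follow: bool) -> str:
    """Translate the two check boxes into a Robots content value.

    An unticked box becomes the negative directive.
    """
    index_part = "INDEX" if index else "NOINDEX"
    follow_part = "FOLLOW" if follow else "NOFOLLOW"
    return f"{index_part},{follow_part}"


def robots_meta_tag(index: bool, follow: bool) -> str:
    """Wrap the content value in the meta tag markup."""
    return f'<META NAME="ROBOTS" CONTENT="{robots_content(index, follow)}">'


print(robots_meta_tag(index=False, follow=True))
# prints <META NAME="ROBOTS" CONTENT="NOINDEX,FOLLOW">
```

One thing to watch with this design: if both boxes start out unticked on a new page, the rendered value is NOINDEX,NOFOLLOW, so the authoring defaults matter.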
Don't Do This
Especially with SDL Tridion, I'd prefer the above over the following, which makes schema updates and searches for items tagged with such features harder:
Index? (Don't do this)
- ( ) Yes
- ( ) No
Follow? (Don't do this)
- ( ) Yes
- ( ) No
Practical Outputs
With Index and Follow as two Boolean options, authors have 4 possible outcomes:
- INDEX,FOLLOW (default)
- INDEX,NOFOLLOW
- NOINDEX,FOLLOW
- NOINDEX,NOFOLLOW
Focus on Behavior
Since the Web Robots Pages point out that INDEX,FOLLOW is assumed as the default, a more business-friendly CMS setup could be: "How should search engines treat this page? Index:"
- (x) Everything (INDEX,FOLLOW) [Selected by Default]
- ( ) Just this page (INDEX,NOFOLLOW)
- ( ) Just links (NOINDEX,FOLLOW)
- ( ) None (NOINDEX,NOFOLLOW)
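A behavior-focused field like this maps naturally onto a single-choice value. The sketch below is one possible way to model it, assuming the rendering code receives the author's choice as an enum member. `RobotsBehavior` and `render_robots` are illustrative names only, and omitting the tag for the default choice is a design assumption based on INDEX,FOLLOW being the default when no tag is present.

```python
from enum import Enum


class RobotsBehavior(Enum):
    """The four author-facing choices and the directive each one produces."""
    EVERYTHING = "INDEX,FOLLOW"
    JUST_THIS_PAGE = "INDEX,NOFOLLOW"
    JUST_LINKS = "NOINDEX,FOLLOW"
    NONE = "NOINDEX,NOFOLLOW"


DEFAULT_BEHAVIOR = RobotsBehavior.EVERYTHING


def render_robots(behavior: RobotsBehavior = DEFAULT_BEHAVIOR) -> str:
    """Return the meta tag, or an empty string for the default choice.

    Leaving the tag out is equivalent to INDEX,FOLLOW, so the default
    choice does not need to emit anything.
    """
    if behavior is DEFAULT_BEHAVIOR:
        return ""
    return f'<META NAME="ROBOTS" CONTENT="{behavior.value}">'


print(render_robots(RobotsBehavior.JUST_LINKS))
# prints <META NAME="ROBOTS" CONTENT="NOINDEX,FOLLOW">
```

Because the author picks exactly one option, contradictory combinations cannot be expressed at all, and a page that nobody touches renders the same as a page with no Robots tag.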
And if you prefer the two-option setup instead, consider using SDL Tridion Experience Manager Page Types, which let you set multiple default options so authors automatically get this with new pages:
- [x] INDEX
- [x] FOLLOW
For more content modeling practice or to learn more about search engine instructions, look at the X-Robots-Tag and how Google handles it.