About Robots Files

Each UXi site has a robots.txt file that gives instructions about the site to web robots, such as the crawlers used by Google and Bing.


It works like this: a robot wants to visit a site URL, say http://www.uxisite.com/welcome/.

Before it does so, it first checks for http://www.uxisite.com/robots.txt, and finds:

 

User-agent: *
Disallow: /

 


The "User-agent: *" means this section applies to all robots.


The "Disallow: /" tells the robot that it should not visit any pages on the site.
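
To see how a well-behaved robot would read this file, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the user agent "ExampleBot" are made up for illustration; real crawlers use their own parsers and may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a hypothetical robot name; the "*" section covers it.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.uxisite.com/welcome/"))
```

With "Disallow: /" in place, the check reports that the page may not be fetched, which is why a site left in this state stays out of search results.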


Important considerations

  1. Any time a site's URL is updated, the process outlined below should be completed. This ensures that all search bots have the most up-to-date information about the site.
  2. Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities and email address harvesters used by spammers will pay no attention to it.
  3. The /robots.txt file is publicly available. Anyone can see which sections of your server you don't want robots to visit.


How to Use Robots in UXi

By default, UXi sites are set to keep bots from reading and indexing them. Whenever a site is taken live, the robots file should be updated.

Once the site is on its live domain, click UXi® Settings > Robots.


  1. Highlight the entire content of the robots file and click delete.
  2. Click Save Robots.

The results will be:

User-agent: * : Applies rules to all robots

Sitemap: http://uxisite.com/sitemap.xml : Sets an exact location for the XML sitemap
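
As a quick sanity check after saving, the resulting file can be run through Python's standard urllib.robotparser module. This is only a sketch: it assumes the saved file contains exactly the two directives listed above, it uses a made-up robot name "ExampleBot", and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Assumed content of the robots file after following the steps above.
LIVE_ROBOTS_TXT = """\
User-agent: *
Sitemap: http://uxisite.com/sitemap.xml
"""

parser = RobotFileParser()
# Parse the text directly instead of fetching it over the network.
parser.parse(LIVE_ROBOTS_TXT.splitlines())

# No Disallow rules remain, so any page should be fetchable.
allowed = parser.can_fetch("ExampleBot", "http://uxisite.com/welcome/")

# site_maps() returns the Sitemap entries as a list (or None if there are none).
sitemaps = parser.site_maps()

if __name__ == "__main__":
    print("allowed:", allowed)
    print("sitemaps:", sitemaps)
```

If the check reports that a page is not allowed, a Disallow line is probably still present in the saved file.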