Removal of links

ImageOak is a place for Internet users to exchange links to webpages containing images they think other people would enjoy too. We are an image search engine, but instead of using a computer to find and sort images into categories, we rely on our users to spot the most interesting pictures out there.

If you are the owner of a website and don't want part of it (or all of it) to be linked to from ImageOak, you must indicate this using the robots.txt standard. This is the same file you would use to block search engine robots. Please see http://www.robotstxt.org for a detailed discussion.

To exclude all or part of a website from ImageOak you must place a file named robots.txt at the root of your site. (So if your website is http://www.yoursite.com, the robots.txt file should be located at http://www.yoursite.com/robots.txt.)
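As a rough illustration of where the file must live, the following Python sketch derives the robots.txt location from any page URL on a site. The function name robots_url is made up for this example and is not part of ImageOak.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL at the root of the site hosting page_url.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.yoursite.com/recipes/applepie.html"))
```

For any page on http://www.yoursite.com this yields http://www.yoursite.com/robots.txt.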

Because ImageOak is not an automated webcrawler but is operated by human Internet users (while the Internet reaches far, it still only spans the planet Earth), ImageOak only parses directives placed in the first

User-agent: ImageOak

block of the robots.txt file. Directives in a

User-agent: *

block are not parsed. To exclude your entire site place the lines

User-agent: ImageOak
Disallow: /

in robots.txt. To prevent links to webpages in the directories /foo and /bar use

User-agent: ImageOak
Disallow: /foo/
Disallow: /bar/
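To make the block selection concrete, here is a minimal Python sketch of how the Disallow lines of the first User-agent: ImageOak block could be collected. It is not ImageOak's actual parser. The function name imageoak_disallows is hypothetical, and several details are assumptions on our part: field names and the agent name are compared case-insensitively, text after '#' is treated as a comment, a block ends at the next User-agent line, and an empty Disallow value is ignored.

```python
def imageoak_disallows(robots_txt: str) -> list[str]:
    """Collect Disallow values from the first 'User-agent: ImageOak' block.

    Illustrative sketch; the comment handling, case rules, and block
    boundaries below are assumptions, not documented ImageOak behaviour.
    """
    rules = []
    in_block = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumed: '#' starts a comment
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_block:
                break  # the first ImageOak block has ended
            in_block = value.lower() == "imageoak"
        elif field == "disallow" and in_block and value:
            rules.append(value)
    return rules


sample = """\
User-agent: *
Disallow: /private/

User-agent: ImageOak
Disallow: /foo/
Disallow: /bar/
"""
print(imageoak_disallows(sample))
```

In this sample the /private/ rule is skipped because it sits in the User-agent: * block, so only /foo/ and /bar/ are returned.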

The parser supports two additions to the classic robots.txt rules. These are a '$' character to end a name, and a '*' to match any string of characters, including the empty string. Only one '*' is allowed in each Disallow line. Some examples:

User-agent: ImageOak
Disallow: /page1.htm$
Disallow: /recipes/apple*.html$

The first Disallow excludes only the URL http://www.yoursite.com/page1.htm. Without the '$', every URL beginning with http://www.yoursite.com/page1.htm would be matched (e.g. http://www.yoursite.com/page1.html). The second Disallow excludes all webpages in the directory recipes whose names begin with apple and end with .html (applejuice.html, applepie.html, ...). You can also disallow specific image files. If you would like to exclude all .gif files in the directory snapshots put

User-agent: ImageOak
Disallow: /snapshots/*.gif$

in the robots.txt file.
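The '$' and '*' rules described above can be sketched in Python as follows. This is our own illustration of the matching semantics, not ImageOak's code. The function name rule_matches is hypothetical, and we assume the rule is compared against the path part of the URL only.

```python
def rule_matches(rule: str, path: str) -> bool:
    """Return True if a Disallow rule matches a URL path.

    Sketch of the semantics described in the text: a rule is a prefix
    match unless it ends in '$', and may contain at most one '*'.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    if rule.count("*") > 1:
        raise ValueError("only one '*' is allowed per Disallow line")
    if "*" not in rule:
        return path == rule if anchored else path.startswith(rule)
    head, tail = rule.split("*", 1)
    if not path.startswith(head):
        return False
    rest = path[len(head):]  # what is left for '*' and the tail to cover
    return rest.endswith(tail) if anchored else tail in rest


for rule, path in [
    ("/page1.htm$", "/page1.htm"),
    ("/page1.htm$", "/page1.html"),
    ("/recipes/apple*.html$", "/recipes/applepie.html"),
    ("/snapshots/*.gif$", "/snapshots/cat.jpg"),
]:
    print(rule, path, rule_matches(rule, path))
```

Under these assumptions /page1.htm$ matches /page1.htm but not /page1.html, and /snapshots/*.gif$ leaves .jpg files in the same directory untouched.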

When a user enters a URL at ImageOak, the server first retrieves the robots.txt file. If the file exists and the URL matches any of the 'Disallow:' directives for User-agent: ImageOak, the webpage is not fetched and the user is told why. Otherwise the page is retrieved. The URL of each image found in the page is then checked against the same directives, and any image that is disallowed is skipped. The 'Disallow:' rules that exclude images must be placed in the robots.txt file at the root of the webserver hosting the webpage, even if the image itself is hosted on another server.
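The order of these checks can be sketched roughly as below. This is a simplified illustration, not the ImageOak server code. The names allowed_images and prefix_match are hypothetical, the rules are assumed to have already been read from the robots.txt of the site hosting the webpage, and we assume every URL is compared by its path only, including images hosted on other servers. The default matcher is a plain prefix test; a matcher that understands '$' and '*' could be passed in instead.

```python
from typing import Callable, List, Optional
from urllib.parse import urlsplit


def prefix_match(rule: str, path: str) -> bool:
    """Classic robots.txt matching: the rule is a path prefix."""
    return path.startswith(rule)


def allowed_images(
    page_url: str,
    image_urls: List[str],
    rules: List[str],
    matches: Callable[[str, str], bool] = prefix_match,
) -> Optional[List[str]]:
    """Return None if the page is disallowed, else the images that may be kept.

    The same rules (from the page's own site) are applied to the page
    and to every image, wherever the image is hosted.
    """
    def blocked(url: str) -> bool:
        path = urlsplit(url).path or "/"  # assumption: match on the path only
        return any(matches(rule, path) for rule in rules)

    if blocked(page_url):
        return None  # page is not fetched; the user is told why
    return [url for url in image_urls if not blocked(url)]


rules = ["/private/", "/snapshots/"]
print(allowed_images(
    "http://www.yoursite.com/gallery.html",
    ["http://www.yoursite.com/snapshots/a.gif",
     "http://images.example.org/pics/b.jpg"],
    rules,
))
```

Here the page itself is allowed, the image under /snapshots/ is skipped, and the image on the other (made-up) host is kept because none of the page's rules match its path.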

If the webpages you do not want linked have already been added to the ImageOak directory, create a robots.txt file as described above. After it is placed on your webserver please write us at

to get the links removed from ImageOak.