
robots.txt tells search engines not to index our publications

Joachim 11 years ago updated 11 years ago 3

http://app.labjam.com/robots.txt tells search engines not to index our publications even though they are now linked from our public cngl website. Is this intended?

There are publications in Labjam that are not ready to be shown to the world but are there as part of a review trajectory. They should not be indexed by the bings and the googles.

(1) Sure, we cannot publish what has not yet received IP clearance. BTW: just disallowing indexing and not linking to a file publicly may not be enough to claim that something wasn't public. URLs like https://app.labjam.com/projects/project51/outfile.pdf are not too difficult to guess, especially if one can find hundreds of examples of what our paper URLs look like on the public website.

(2) There are simple solutions: (a) serve the documents dynamically (checking authentication) rather than as static files - this also solves the issue above ("BTW"), (b) move the public files into a folder that can be indexed, (c) copy the public files to www.cngl.ie where they can be indexed.

(d) enforce that all publications are submitted to an OpenAccess repository such as DORAS in DCU and only link to the publication entries in such repositories from the public website.
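
To make (a) and (b) a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `/public/` folder, the robots.txt content, the document IDs and the `serve_document` helper are hypothetical and not how Labjam actually works. It only uses the standard library `urllib.robotparser` to show how a crawler would read such a robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for option (b): only /public/ may be crawled.
# The Allow line comes first because Python's parser applies the first matching rule.
ROBOTS_TXT = """\
User-agent: *
Allow: /public/
Disallow: /
"""

# Hypothetical set of publications that have IP clearance.
CLEARED_DOCS = {"paper-a.pdf"}


def crawler_may_fetch(path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a well-behaved crawler is allowed to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", path)


def serve_document(doc_id: str, user_is_authenticated: bool) -> int:
    """Option (a): decide per request instead of serving static files.

    Returns the HTTP status the application would send: 200 if the
    document is cleared or the user is logged in, otherwise 403.
    """
    if doc_id in CLEARED_DOCS or user_is_authenticated:
        return 200
    return 403
```

Note that robots.txt only asks crawlers to stay away. The check in `serve_document` is the part that actually stops someone who guesses a URL.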