The Internet Archive has announced that it will now ignore robots.txt files when it saves a copy of a site. The main function of robots.txt files is to tell search engines which parts of a website they should index.
“Over time we have observed that the robots.txt files that are
geared toward search engine crawlers do not necessarily serve our
archival purposes,” stated a blog post that the organization published
last week. “Internet Archive’s goal is to create complete ‘snapshots’ of
web pages, including the duplicate content and the large versions of
files.”
Robots.txt files are increasingly being used to remove entire domains from search engines following their transition from a live, accessible site to a parked domain. If a site goes out of business, and is rendered inaccessible in this way, it also becomes unavailable for viewing via the Internet Archive’s Wayback Machine. The organization apparently receives queries about these sites on a daily basis.
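To make the parked-domain scenario concrete, here is a minimal sketch in Python that uses the standard library's `urllib.robotparser` to show how a crawler that honors robots.txt would read two rule sets. Both rule sets, the `ExampleBot` user agent, the `example.com` URLs, and the `is_allowed` helper are hypothetical and chosen only for illustration; this is not the Internet Archive's crawler logic.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of a live site: only one directory is off limits.
LIVE_SITE_RULES = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical robots.txt of a parked domain: the whole site is off limits.
PARKED_DOMAIN_RULES = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots_txt may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/index.html"
    print("live site:    ", is_allowed(LIVE_SITE_RULES, "ExampleBot", page))
    print("parked domain:", is_allowed(PARKED_DOMAIN_RULES, "ExampleBot", page))
```

Under the parked-domain rules every URL is refused, which is why a crawler that obeys the file has to treat the whole domain as excluded, including pages that were public while the site was live. A crawler that ignores robots.txt simply skips this check.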
Source: Digital Trends
https://www.adslgr.com/forum/threads/988621-%CE%A4%CE%BF-Internet-Archive-%CE%B8%CE%B1-%CE%B1%CE%B3%CE%BD%CE%BF%CE%B5%CE%AF-%CF%84%CE%B1-robots-txt-%CE%B1%CF%81%CF%87%CE%B5%CE%AF%CE%B1-%CF%83%CF%84%CE%B7%CE%BD-%CE%B1%CF%81%CF%87%CE%B5%CE%B9%CE%BF%CE%B8%CE%AD%CF%84%CE%B7%CF%83%CE%B7-%CF%84%CE%BF%CF%85