Web Crawling: Software
To use the Web Crawling application, you must first register with ENEAGRID:
https://gridaccount.enea.it/
During registration, in the field "Why are you requesting an ENEAGRID account?", you must indicate:
"To use the Web Crawling virtual laboratory".
Tools
BUbiNG is a scalable, fully distributed crawler that is currently under development and supersedes UbiCrawler.
For any information, feel free to email giuseppe.santomauro@enea.it.
BUbiNG can be whitelisted or blacklisted as you prefer, using the following addresses:
IPs: 90.147.171.[225-232]
Domains: crawler[01-08].portici.enea.it
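If you maintain a whitelist or blacklist programmatically, the bracketed ranges above can be expanded into explicit entries. The following is a minimal Python sketch, assuming the bracket notation denotes inclusive ranges (225 to 232 and 01 to 08); the names CRAWLER_IPS, CRAWLER_HOSTS and is_bubing_address are hypothetical and chosen only for illustration. It makes no assumption about which host name maps to which IP address.

```python
import ipaddress

# Ranges copied from the text above; the inclusive reading of the
# bracket notation is an assumption.
CRAWLER_IPS = {ipaddress.ip_address(f"90.147.171.{n}") for n in range(225, 233)}
CRAWLER_HOSTS = [f"crawler{n:02d}.portici.enea.it" for n in range(1, 9)]


def is_bubing_address(addr: str) -> bool:
    """Return True if addr is one of the listed BUbiNG crawler IPs."""
    return ipaddress.ip_address(addr) in CRAWLER_IPS


if __name__ == "__main__":
    for host in CRAWLER_HOSTS:
        print(host)
    print(is_bubing_address("90.147.171.230"))
```

The resulting lists can then be fed to whatever firewall or access-control mechanism your server uses.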
How to Stop BUbiNG
BUbiNG supports the Robot Exclusion Standard. If you want to exclude your site from being crawled by BUbiNG, see The Web Robots Pages. Briefly, put the following lines into the robots.txt file at the root of the web server you want to exclude from crawling:
User-agent: BUbiNG
Disallow: /
Presently, BUbiNG honours changes to the robots.txt file (usually every hour), but does not obey META tags for robot exclusion.
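Before deploying the exclusion rule, you may want to check that it blocks BUbiNG without affecting other robots. The following is a minimal sketch using the robots.txt parser from the Python standard library; the URL and the "OtherBot" agent name are placeholders, and Python's matching rules are only an approximation of how BUbiNG itself interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The two directives from the text above, one per line.
ROBOTS_TXT = """\
User-agent: BUbiNG
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url (per Python's parser)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.org and OtherBot are placeholders for illustration only.
    print(allowed("BUbiNG", "https://example.org/page"))
    print(allowed("OtherBot", "https://example.org/page"))
```

With these directives, the first call should report that BUbiNG is disallowed, while an agent not named in the file remains allowed.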