Saturday 1 March 2014

Google and my RSS feed site

I wrote previously about my RSS reader - a simple Perl script
which does RSS aggregation and provides its own internal web service.
It is very small and simple, designed to reduce bandwidth use on my
mobile. It has been very effective so far - instead of 50-75+MB/day
I used 30MB for the *week*.

It's effective because it's no-frills and supports a zipped payload
page - no graphics, no external references, etc. Wow! Is it good :-)

http://crisp.dyndns-server.com:3000/p.html
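To give a feel for why a zipped, text-only page saves so much bandwidth, here is a minimal sketch in Python (not the Perl script itself). The function name zipped_page and the sample page are made up for illustration; the real reader may build and compress its payload quite differently.

```python
import gzip

def zipped_page(html: str) -> bytes:
    """Compress a text-only HTML page with gzip (hypothetical helper)."""
    return gzip.compress(html.encode("utf-8"))

# Made-up page: repetitive headline markup and no images or external
# references, which is the kind of content that compresses well.
page = "<html><body>" + "<p>headline and summary text</p>" * 200 + "</body></html>"
payload = zipped_page(page)

print(len(page.encode("utf-8")), "bytes raw ->", len(payload), "bytes zipped")
```

A browser only unpacks this transparently if the response is sent with a "Content-Encoding: gzip" header, so the server side has to set that as well.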

I'll make gradual changes and enhancements to it over time - so if
you use it, you may see changes (and feel free to mail me with
ideas or questions). There is a definite fine line between
bloat and functionality.

Anyway - I am staring at the HTTP access log and notice something
very peculiar. This app - by its nature of aggregating news from a few
web sites - has attracted the Googlebot crawler. Great - why not.

But I notice a request come in with a "Referer" link like this:


http://www.google.com/search?q=Christina+Mari....


Interesting. So, someone clicked from Google to get to me? I must be
famous! Anyhow, I paste that link into the browser, and I can see - there
I am - on page 1 of the results. Neat! Managed to achieve something
most people dream about.

But - isn't this a security hole? If I attract these requests en masse
and start collecting them, then I am finding out something very
personal about the requester: what they searched for, alongside their
IP address and browser.
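To make the concern concrete, here is a rough Python sketch of how little work it takes to pull search terms out of an access log. It assumes the common Apache-style "combined" log format, which may not match what my server actually writes, and the sample line (IP address, query, browser string) is entirely made up. The name search_terms is hypothetical.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed "combined" log format: ip, identity, user, [time], "request",
# status, size, "referer", "user agent". A real log may differ.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def search_terms(line):
    """Return (ip, agent, query) when the referer carries a q= parameter, else None."""
    m = LOG_RE.match(line)
    if not m:
        return None
    referer = urlparse(m.group("referer"))
    q = parse_qs(referer.query).get("q")
    if not q:
        return None
    return m.group("ip"), m.group("agent"), q[0]

# Made-up sample line, for illustration only.
SAMPLE = (
    '203.0.113.7 - - [01/Mar/2014:10:15:32 +0000] "GET /p.html HTTP/1.1" 200 5120 '
    '"http://www.google.com/search?q=example+query" "Mozilla/5.0 (hypothetical)"'
)

print(search_terms(SAMPLE))
```

Run over a whole log, this yields one (IP, browser, search phrase) triple per visitor who arrived from a search result, which is exactly the kind of collection worried about above.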

Interesting...will have to look more at this to see what is going on.
I have no direct use for such data - but of course, every site out
there that gets visitors coming from Google is getting a referer link.



Post created by CRiSP v11.0.25a-b6698

