Project Information:

scraparazzie is a Python package for searching Google News feeds by topic or keyword. I spent around 4 days working on it. The package uses gnewsclient as its framework, then modifies and enhances its features.

Features that gnewsclient has:

  1. Set the language, location, topic and number of output items for a search;
  2. Show the image and video links of a news item (if there are any)

Features that scraparazzie has:

  1. Shows the publish datetime;
  2. Besides topic search, keyword search is included (a search uses either a topic or a keyword, and keyword search takes priority);
  3. The feature that shows image and video links (if there are any) has been removed;
  4. Results are ordered from latest to oldest;
  5. Results can be exported as a list
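
To make the list above a bit more concrete, here is a minimal sketch of the same ideas using only the Python standard library. It is not the actual scraparazzie code: the function names `pick_search` and `parse_feed`, the dictionary keys and the sample feed are all made up for illustration, and I am assuming the feed is ordinary RSS with `title`, `link`, `pubDate` and `source` fields in each item.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feed, invented for illustration only.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Older story</title>
      <link>https://example.com/older</link>
      <pubDate>Mon, 01 Jun 2020 08:00:00 GMT</pubDate>
      <source url="https://example.com">Example Times</source>
    </item>
    <item>
      <title>Newer story</title>
      <link>https://example.org/newer</link>
      <pubDate>Tue, 02 Jun 2020 09:30:00 GMT</pubDate>
      <source url="https://example.org">Example Post</source>
    </item>
  </channel>
</rss>"""


def pick_search(topic=None, keyword=None):
    """Decide the search mode; a keyword takes priority over a topic."""
    if keyword:
        return ("keyword", keyword)
    if topic:
        return ("topic", topic)
    # Assumption: with neither given, fall back to a general feed.
    return ("top", None)


def parse_feed(rss_text, max_results=10):
    """Return news items as a list of dicts, latest first."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "source": item.findtext("source", default=""),
            "link": item.findtext("link", default=""),
            # RSS dates use the RFC 822 style, which email.utils can parse.
            "publish_date": parsedate_to_datetime(item.findtext("pubDate")),
        })
    # Sort from latest to oldest, then cap the number of results.
    items.sort(key=lambda news: news["publish_date"], reverse=True)
    return items[:max_results]


if __name__ == "__main__":
    print(pick_search(topic="Business", keyword="python"))
    for news in parse_feed(SAMPLE_RSS):
        print(news["publish_date"], news["title"], news["source"], news["link"])
```

Because the result is a plain list of dictionaries, it can be passed straight on to whatever comes next (saving to a file, filtering, sending a digest and so on).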

The main reason I decided to take on this project is personal use - I need something that can collect the latest news with title, source, link and publish date at a fixed time for further application. Of course I searched through tons of tutorials, most of which suggest doing it with newspaper3k or beautifulsoup. There are some packages and code for Google News on GitHub, e.g. google-news-scraper, GoogleNews, google_news, gnp and gnewsclient. However, they don't offer many features, and I couldn’t find a package that fits all my requirements. gnewsclient is the best one I could find, but I still couldn’t get the publish datetime from it.

At the beginning I wanted to modify the package directly, but I realised it’s not an easy job; furthermore, I might want to plug in some more features later. In the end I kept only the framework of gnewsclient and rewrote the news processing part. After tons of trial & error and precious help from excellent programmers, the package works as I want…and yes, I can temporarily take a break here. Before I wrapped things up, I noticed a post on gnewsclient from a few years ago requesting a keyword search feature, which never got any feedback. It didn’t look difficult, and modifying the code didn’t take much effort.

After that, I packaged it - my very first Python package EVER in my life (sounds like I should celebrate? As a non-professional coder…😬). It was kind of silly that I only realised afterwards that I should have added the export-as-list feature and fixed a few tiny errors in the Readme.md, so it took me some more time to deal with those - and that became v1.2.4.

v1.2.4 is a relatively stable version. I also fixed some problems that a fresh installation may run into, so there should be fewer errors. If you have questions, enquiries, bug reports and the like, please open an issue on GitHub.