Sunday 17 August 2014

Story Crawler - Crawling RSS News Feeds And Maintaining a Database

Main Concept:
--> Crawling news (from the Internet).
--> Storing it into the database.
--> Retrieving information from the database to the application.

The application crawls RSS feeds from the available sources and stores all the gathered information in a database. The information can then be retrieved from the database and shown in different views. The flow is automatic: the sources provide the input and the application's views show the output, so we do not manually control what goes into or comes out of the database.
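
The crawl, store, and retrieve flow described above can be sketched roughly as follows. This is only an illustration in Python, not the application's actual code: it uses an in-memory SQLite database as a stand-in for MS SQL Server, and the sample feed, the `stories` table, and the function names are all assumptions made up for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real crawler would download this XML from a source URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
      <description>Summary of the first story.</description>
      <pubDate>Sun, 17 Aug 2014 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
      <description>Summary of the second story.</description>
      <pubDate>Sun, 17 Aug 2014 11:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return one dict per <item> element of an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


def store_items(conn, items):
    """Insert stories; the link is the key, so re-syncing adds no duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stories ("
        "link TEXT PRIMARY KEY, title TEXT, description TEXT, published TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO stories (link, title, description, published) "
        "VALUES (:link, :title, :description, :published)",
        items,
    )
    conn.commit()


def load_titles(conn):
    """Read stored titles back, as an application view would."""
    return [row[0] for row in conn.execute("SELECT title FROM stories ORDER BY link")]


conn = sqlite3.connect(":memory:")
store_items(conn, parse_items(SAMPLE_RSS))
store_items(conn, parse_items(SAMPLE_RSS))  # a second sync stores nothing new
print(load_titles(conn))  # ['First story', 'Second story']
```

Using the story link as the primary key is one simple way to keep repeated syncs from storing the same story twice; the real application may handle this differently.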


Features:
  • Crawls RSS feeds from 15 predefined sources, 5 each from BBC, CNN, and GoogleNews; more sources can be added.
  • Once the news is synced, the application works completely offline. You can view not only the details of a story but also the full news story, because whole HTML webpages are stored in the database.
  • An admin panel is available in case you want to see how data is stored in the database, delete a news story, or clear the whole database.
  • SQL wildcards are supported in the search options: "SearchBy Title", "SearchBy Description", and "Search in provided date range".
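
To give an idea of how wildcard and date-range searches of this kind can work, here is a small sketch in Python. It is not the application's code: it uses SQLite instead of MS SQL Server, and the `stories` table, its columns, the sample rows, and the `search` function are assumptions for illustration. The `%` wildcard (any run of characters) and `_` (any single character) work with `LIKE` in both SQLite and SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stories (title TEXT, description TEXT, published TEXT)")
conn.executemany(
    "INSERT INTO stories VALUES (?, ?, ?)",
    [
        ("Storm hits coast", "Heavy rain and wind reported.", "2014-08-10"),
        ("Market update", "Stocks close higher.", "2014-08-15"),
        ("Storm cleanup begins", "Crews clear debris.", "2014-08-16"),
    ],
)


def search(conn, column, pattern, start=None, end=None):
    """Search one text column with a LIKE pattern, optionally within a date range."""
    # Column names cannot be bound as parameters, so only allow known columns.
    if column not in ("title", "description"):
        raise ValueError("unsupported search column")
    sql = "SELECT title FROM stories WHERE " + column + " LIKE ?"
    params = [pattern]
    if start is not None and end is not None:
        # ISO dates (YYYY-MM-DD) compare correctly as plain text.
        sql += " AND published BETWEEN ? AND ?"
        params += [start, end]
    sql += " ORDER BY published"
    return [row[0] for row in conn.execute(sql, params)]


print(search(conn, "title", "Storm%"))           # ['Storm hits coast', 'Storm cleanup begins']
print(search(conn, "description", "%close%"))    # ['Market update']
print(search(conn, "title", "%", "2014-08-15", "2014-08-16"))  # ['Market update', 'Storm cleanup begins']
```

The search pattern and dates are passed as query parameters rather than concatenated into the SQL text, which keeps user input from being interpreted as SQL.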

Setting it up:
  1. Download the package from the download section below.
  2. Attach the database to your MS SQL Server.
  3. Open the project file and go to "App.config". Provide the correct connection string there, or change the name "Dexter" to your SQL Server instance name, then save.
  4. Run the application and click the "Update Resources" button. (This builds the sources in the database.)
    - Takes approximately 1 minute, depending on your internet speed.
  5. Click the "Sync All" button. (This starts crawling the RSS feeds from the internet and storing the retrieved information in your database.)
    - Takes approximately 1-30 minutes, depending on your internet speed.

You are ready to go!



Developed In:
-  MS Visual Studio 2010 (using .NET Framework 4)
-  MS SQL Server 2008 R2 Service Pack 3

Download:
Filename: Story Crawler.rar
File size: 1.04 MB
Package Contains: MS Visual Studio 2010 Project File and MS SQL Server 2008 Database


Mirror 1: Click to Download
