Commit message (Author, Age, Lines)
* Another batch of strange entries. (Petter Reinholdtsen, 2015-01-13, -1/+1)
|
* Add test script for elasticsearch. (Petter Reinholdtsen, 2015-01-13, -0/+105)
|
* Reduce noise and do not log passwords. (Petter Reinholdtsen, 2015-01-04, -1/+1)
|
* Scan a bit more, and stay 100 000 behind the current pointer. (Petter Reinholdtsen, 2015-01-04, -2/+2)
|
* Make sure errors are reported. (Petter Reinholdtsen, 2015-01-04, -1/+0)
|
* Make scraper more robust. (Petter Reinholdtsen, 2015-01-04, -6/+8)
|
* New scraper. (Petter Reinholdtsen, 2015-01-04, -0/+1)
|
* Improve scraper. (Petter Reinholdtsen, 2015-01-04, -1/+4)
|
* New scraper for Nordreisa kommune. (Petter Reinholdtsen, 2015-01-04, -0/+90)
|
* Improve message. (Petter Reinholdtsen, 2014-12-29, -1/+2)
|
* Do not crash on non-existing URLs. (Petter Reinholdtsen, 2014-12-29, -1/+3)
|
* Accept problematic pages. (Petter Reinholdtsen, 2014-12-28, -2/+2)
|
* Disable debugging. (Petter Reinholdtsen, 2014-12-28, -1/+1)
|
* Add meta info. (Petter Reinholdtsen, 2014-12-28, -1/+9)
|
* Reduce the days scanning backwards. (Petter Reinholdtsen, 2014-12-27, -1/+1)
|
* Check a larger time period to handle vacations. (Petter Reinholdtsen, 2014-12-27, -2/+2)
|
* Flush stdout too. (Petter Reinholdtsen, 2014-12-22, -0/+2)
|
* Improve output. (Petter Reinholdtsen, 2014-12-22, -1/+2)
|
* More compact output. (Petter Reinholdtsen, 2014-12-22, -3/+3)
|
* Done reparsing strange entries. (Petter Reinholdtsen, 2014-12-21, -1/+1)
|
* Allow scraper to use more CPU. (Petter Reinholdtsen, 2014-12-21, -1/+1)
|
* Update and add meta info. (Petter Reinholdtsen, 2014-12-20, -2/+12)
|
* Parse 2013 too, and reorder code. (Petter Reinholdtsen, 2014-12-19, -2/+5)
|
* Avoid scraping more pages than we need to. (Petter Reinholdtsen, 2014-12-19, -1/+1)
|
* Fix typo in pagination handling. (Petter Reinholdtsen, 2014-12-19, -1/+1)
|
* Remember charmap handling. (Petter Reinholdtsen, 2014-12-18, -1/+1)
|
* HTML scraper is working, save its result. (Petter Reinholdtsen, 2014-12-18, -3/+3)
|
* Add vendor. (Petter Reinholdtsen, 2014-12-18, -1/+1)
|
* Map doctype. (Petter Reinholdtsen, 2014-12-18, -0/+4)
|
* Fix typo. (Petter Reinholdtsen, 2014-12-18, -2/+2)
|
* More typos. (Petter Reinholdtsen, 2014-12-18, -2/+2)
|
* Typo. (Petter Reinholdtsen, 2014-12-18, -1/+1)
|
* First version of the HTML parser. (Petter Reinholdtsen, 2014-12-18, -2/+143)
|
* New scraper. (Petter Reinholdtsen, 2014-12-18, -0/+1)
|
* New scraper for the University of Tromsø. Not yet complete. (Petter Reinholdtsen, 2014-12-17, -0/+95)
|
* Add meta-info. (Petter Reinholdtsen, 2014-12-17, -1/+12)
|
* Get scraper working again. (Petter Reinholdtsen, 2014-12-14, -8/+8)
|
* Disable missing pages. (Petter Reinholdtsen, 2014-12-14, -6/+6)
|
* Typo. (Petter Reinholdtsen, 2014-12-14, -2/+2)
|
* Correct URLs. (Petter Reinholdtsen, 2014-12-14, -2/+13)
|
* Get script working with local SQLite files. (Petter Reinholdtsen, 2014-12-13, -31/+74)
|
* Handle unlimited CPU quota. (Petter Reinholdtsen, 2014-12-13, -2/+2)
|
* Make scraper more robust. (Petter Reinholdtsen, 2014-12-13, -6/+8)
|
* More meta-info. (Petter Reinholdtsen, 2014-12-13, -2/+10)
|
* Add duration. (Petter Reinholdtsen, 2014-12-10, -0/+1)
|
* Quiet down URL extractor. (Petter Reinholdtsen, 2014-12-10, -2/+2)
|
* Well known ordering when reparsing. (Petter Reinholdtsen, 2014-12-10, -1/+1)
|
* Might as well flush the buffer when touching the database to remove an entry. (Petter Reinholdtsen, 2014-12-10, -0/+3)
|
* Typo. (Petter Reinholdtsen, 2014-12-10, -1/+1)
|
* Add meta info. (Petter Reinholdtsen, 2014-12-10, -1/+11)
|