title Kazakh Financial Licences (kz-licences) draft_failed
description This bot scrapes information about entities with seven types of financial licence in the Republic of Kazakhstan
current run state not running
last run single run snapshot draft scrape failed on March 19, 2016 17:55
next run n/a
created by dinotash (Tom Curtis)
last reviewed by peter.evans
Bot state update
commented about 8 years ago

A draft run failed

Bot state update
dinotash commented about 8 years ago

The bot was pushed; scheduling a draft run

(no subject)
peter.evans commented about 8 years ago

Hi Tom,
This one has been out of action for a few months but it looks like it's produced a meaningful error now -
This is the error report:
The following record is invalid:
{"licence_jurisdiction":"Kazakhstan","category":"Financial","confidence":"HIGH","company_name":"\"Atyrau-Currency\" Microfinancial organization\" LLP","regulator":"National Bank of Kazakhstan","sample_date":"2016-02-04","company_jurisdiction":"Kazakhstan","jurisdiction_classification":["Microfinancial organisation"],"start_date":"2015-17-11","source_url":"http://www.afn.kz/attachments/135/273/publish273-1099653.xlsx"}
* Property not of expected format: start_date (must be of format yyyy-mm-dd)
This suggests that the start date for that record is in the format yyyy-dd-mm instead of yyyy-mm-dd.
I suspect resolving that should get the bot back up and running, but we'll see! If the data source doesn't use a single date format consistently, then sometimes we have to leave that field out.
Thanks & I hope all is well with you
Peter
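
One way the fix described above could look, as a minimal sketch rather than the bot's actual code: try yyyy-mm-dd first, fall back to yyyy-dd-mm, and drop start_date when neither parses. The function names `normalise_start_date` and `clean_record` are hypothetical and do not come from licence.py.

```python
from datetime import datetime

def normalise_start_date(raw):
    """Return raw as yyyy-mm-dd, or None if it fits neither known layout."""
    # Try the expected layout first, then the swapped day/month layout.
    for fmt in ("%Y-%m-%d", "%Y-%d-%m"):
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

def clean_record(record):
    """Return a copy of record with start_date fixed, or removed if unparseable."""
    cleaned = dict(record)
    raw = cleaned.get("start_date")
    if raw is not None:
        fixed = normalise_start_date(raw)
        if fixed is None:
            del cleaned["start_date"]  # leave the field out rather than emit a bad date
        else:
            cleaned["start_date"] = fixed
    return cleaned
```

With the record from the error report, `normalise_start_date("2015-17-11")` gives `"2015-11-17"`. Note that a value such as "2015-03-04" is ambiguous and is kept as yyyy-mm-dd by this sketch, so it only helps when the day is greater than 12; if the source mixes both layouts, leaving the field out may be the safer option.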

Bot state update
commented about 8 years ago

A run failed

Bot state update
peter.evans commented about 8 years ago

A failed run was restarted by the moderator

Saved vars cleared
dinotash commented about 8 years ago

Bot state update
peter.evans commented about 8 years ago

A failed run was restarted by the moderator

Saved vars cleared
dinotash commented about 8 years ago

Bot state update
commented over 8 years ago

A run failed

Bot state update
peter.evans commented over 8 years ago

A failed run was restarted by the moderator

Saved vars cleared
dinotash commented over 8 years ago

Bot state update
commented over 8 years ago

A run failed

Bot state update
commented over 8 years ago

A run started

Bot state update
commented over 8 years ago

A snapshot completed; scheduling the first run of the next snapshot

Bot state update
commented over 8 years ago

A run started

Bot state update
commented over 8 years ago

A snapshot completed; scheduling the first run of the next snapshot

Bot state update
peter.evans commented over 8 years ago

A failed run was restarted by the moderator

Bot state update
commented over 8 years ago

A run failed

Bot state update
commented over 8 years ago

A run started

Bot state update
commented almost 9 years ago

A run succeeded; scheduling the next run

Bot state update
peter.evans commented almost 9 years ago

The bot was accepted; starting run to ingest reviewed data

Bot state update
commented almost 9 years ago

A draft run succeeded; sending for final review

Bot state update
peter.evans commented almost 9 years ago

A moderator has approved the draft bot; running a full draft for final review

Bot state update
peter.evans commented almost 9 years ago

A moderator has started reviewing the draft bot

Re: (turbot bot [kz-licences])
dinotash commented almost 9 years ago

Think that latest push should fix it

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Re: (turbot bot [kz-licences])
dinotash commented almost 9 years ago

Let me have a look. I think there are some rows in the spreadsheet where the "name" column was blank. If so, there won't be a name field in the primary data to transform, which would lead to the error below.
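
A guard along these lines would avoid the KeyError, assuming raw_record is a plain dict as the traceback suggests. This is a sketch, not the fix that was actually pushed; `SUMMARY_ROWS` and `should_skip` are hypothetical names, and the two summary-row labels are the ones quoted in the traceback.

```python
# Labels of the summary rows that should not become licence records
# (taken from the check quoted in the traceback).
SUMMARY_ROWS = {
    "Total Banks",
    "Weighted average of banking operations in total number of banks",
}

def should_skip(raw_record):
    """True for rows with no usable name or with a summary-row label."""
    # .get() avoids the KeyError when the "name" column was blank in the sheet.
    name = (raw_record.get("name") or "").strip()
    return not name or name in SUMMARY_ROWS
```

The transformer loop could then call `should_skip(raw_record)` and continue to the next row before reading any other field.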

(no subject)
peter.evans commented almost 9 years ago

Hey Tom,
Getting a strange error with kz-licences; it seems this is causing some records to transform without a "name" value. Does this make sense to you, or is it likely something that I should get a coder here to check out?
Thanks,
Peter
error:
---
Traceback (most recent call last):
File "licence.py", line 23, in <module>
if (raw_record['name'] in ["Total Banks", "Weighted average of banking operations in total number of banks"]):
KeyError: 'name'

Bot state update
commented almost 9 years ago

A final draft run failed

Bot state update
peter.evans commented almost 9 years ago

A moderator has approved the draft bot; running a full draft for final review

Bot state update
peter.evans commented almost 9 years ago

A moderator has started reviewing the draft bot

Re: (turbot bot [kz-licences])
dinotash commented almost 9 years ago

Yep :)

(no subject)
peter.evans commented almost 9 years ago

Hi Tom,
Thanks for fixing up this bot - It sounds like the extra info would be good but optional, probably more important to get some of this into OpenC - Would you be happy for me to accept this bot onto the main db & perhaps revisit in the future if you have the time/ inclination?
Best,
Peter

Re: (turbot bot [kz-licences])
dinotash commented almost 9 years ago

Hi Peter
I’ve uploaded a new version, which addresses most but not all of the points below.
The one thing I haven’t done yet is the below. This isn’t because I disagree but because it’s a lot harder than the others. The problem is to do with handling merged cells and multiple header rows - I thought I’d caught all the cases in the sources, but obviously didn’t. So it’s only scraping one of the four columns under the “Licences” header.
> On this sheet (http://www.afn.kz/attachments/193/243/publish243-1076817.xls) it looks as if there is a “Number” field which I think equates to licence_number that is not yet scraped. There’s also a “Date of Granting” which would equate to start_date I think. I like what you’ve done with the jurisdiction_classification in this instance; even though more specific text is available I think just the general type is fine, particularly in the transformed output.
Will get around to fixing this at some point, but it may not be quick. I’ve forgotten how I did it all in the first place, so I have to find a way to fix it without breaking anything else.
Tom
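
For illustration, the merged-cell problem described above can be handled by carrying each top-level header forward across the blank cells that a merge leaves behind. This is a minimal sketch and not the scraper's code: it assumes the two header rows have already been read into lists, with a merged cell showing its text in the first column and empty strings in the rest. `flatten_headers` is a hypothetical name, and the "Type" and "Status" sub-headers in the usage note are invented placeholders.

```python
def flatten_headers(top_row, sub_row):
    """Combine two header rows into one list of column names."""
    names = []
    current = ""
    for top, sub in zip(top_row, sub_row):
        top = (top or "").strip()
        sub = (sub or "").strip()
        if top:
            current = top  # a new top-level header starts here
        if current and sub:
            names.append(f"{current}: {sub}")
        else:
            names.append(sub or current)
    return names
```

For example, `flatten_headers(["Name", "Licences", "", "", ""], ["", "Number", "Date of Granting", "Type", "Status"])` gives one name per column, with all four sub-columns prefixed by "Licences". The forward fill is a simplification: it would mislabel a column that has a blank top cell but is not part of the merge, so a stricter version would use the merged-cell ranges if the spreadsheet library exposes them.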

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

(no subject)
peter.evans commented almost 9 years ago

Hi Tom,
Good to see this one working now - Thank you for writing it and sticking with it while the issues with the framework were sorted out.
On this sheet: http://www.afn.kz/attachments/189/243/publish243-1109091.xls We seem to be transforming a couple of rogue records:
{
  "licence_jurisdiction": "Kazakhstan",
  "category": "Financial",
  "confidence": "HIGH",
  "company_name": "Total Banks",
  "regulator": "National Bank of Kazakhstan",
  "licence_number": "38.0",
  "sample_date": "2015-06-16",
  "company_jurisdiction": "Kazakhstan",
  "jurisdiction_classification": [
    "Banking operation"
  ],
  "source_url": "http://www.afn.kz/attachments/189/243/publish243-1109091.xls"
}
{
  "licence_jurisdiction": "Kazakhstan",
  "category": "Financial",
  "confidence": "HIGH",
  "company_name": "Weighted average of banking operations in total number of banks",
  "regulator": "National Bank of Kazakhstan",
  "sample_date": "2015-06-16",
  "company_jurisdiction": "Kazakhstan",
  "jurisdiction_classification": [
    "Banking operation"
  ],
  "source_url": "http://www.afn.kz/attachments/189/243/publish243-1109091.xls"
}
This is already a complex bot, but if you wanted to separate out licence_number and start_date from “License: Reg.№, Date of Granting” in the scraper.out for this sheet (http://www.afn.kz/attachments/190/243/publish243-1086213.xls) then that does look possible.
On this sheet (http://www.afn.kz/attachments/193/243/publish243-1076817.xls) it looks as if there is a “Number” field which I think equates to licence_number that is not yet scraped. There’s also a “Date of Granting” which would equate to start_date I think. I like what you’ve done with the jurisdiction_classification in this instance; even though more specific text is available I think just the general type is fine, particularly in the transformed output.
In this document we could be transforming start_date from the scraper.out “date of granting license” field (http://www.afn.kz/attachments/194/243/publish243-1013856.doc).
It might be best to defer applying any of the changes above, that you agree with, until after I’ve sent over some rich-licence docs/ examples.
Thanks Tom, another great bot - complex & very well implemented.
All the best,
Peter
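
Separating licence_number and start_date from the combined “License: Reg.№, Date of Granting” field might look roughly like the sketch below. The real cell layout is not shown in this thread, so this assumes, purely for illustration, that the registration number comes first and is followed by a date written as dd.mm.yyyy or dd/mm/yyyy. `split_licence_cell` is a hypothetical name and the sample value in the usage note is made up.

```python
import re
from datetime import datetime

# Assumed date layout: day, month, four-digit year, separated by "." or "/".
DATE_RE = re.compile(r"(\d{1,2})[./](\d{1,2})[./](\d{4})")

def split_licence_cell(cell):
    """Split a combined cell into licence_number and start_date (either may be absent)."""
    result = {}
    match = DATE_RE.search(cell)
    number_part = cell
    if match:
        day, month, year = (int(g) for g in match.groups())
        try:
            result["start_date"] = datetime(year, month, day).strftime("%Y-%m-%d")
        except ValueError:
            pass  # not a real calendar date, so leave start_date out
        number_part = cell[:match.start()]  # assume the number precedes the date
    number = number_part.strip(" ,;№\t")
    if number:
        result["licence_number"] = number
    return result
```

With a made-up value, `split_licence_cell("№ 1.2.3, 17.11.2015")` returns a licence_number of "1.2.3" and a start_date of "2015-11-17". Cells that list several licences or use other date layouts would need extra handling.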

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

Run succeeded; sending for draft review

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
sebbacon commented almost 9 years ago

A failed draft run was restarted by the moderator

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
sebbacon commented almost 9 years ago

A failed draft run was restarted by the moderator

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
sebbacon commented almost 9 years ago

A failed draft run was restarted by the moderator

(no subject)
peter.evans commented almost 9 years ago

Hi Tom,
Thanks - That looks like an issue for someone on the development team, unfortunately Peter is away this week, he'd normally get this sorted pretty quick. Will see if anyone else has any time.
Thanks,
Peter

Re: (turbot bot [kz-licences])
dinotash commented almost 9 years ago

Hi Peter
There's another issue with this one first, in terms of converting word files. I've posted about it on slack in the bots channel.
Thanks
Tom

(no subject)
peter.evans commented almost 9 years ago

Hi Tom
Thanks for pushing another scraper! This one has a slight issue in the Manifest where I think the licence.py is not being included:
"files": [
"scraper.py"
],
If you'd like to push this fix then I can start reviewing the data.
Thanks,
Peter
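
The manifest problem described above can be caught before pushing with a small check. This sketch assumes the manifest has the "files" and "transformers" keys shown in the Config section of this page; `missing_manifest_files` is a hypothetical helper and not part of the turbot tooling.

```python
def missing_manifest_files(manifest):
    """Return transformer files that are not listed under "files"."""
    listed = set(manifest.get("files", []))
    needed = {t["file"] for t in manifest.get("transformers", []) if "file" in t}
    return sorted(needed - listed)
```

For a manifest whose "files" list holds only "scraper.py" while a transformer points at "licence.py", the function returns `["licence.py"]`; an empty list means every transformer file is included.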

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Bot state update
commented almost 9 years ago

A draft run failed

Bot state update
dinotash commented almost 9 years ago

The bot was pushed; scheduling a draft run

Run history

event metadata
single run snapshot draft scrape failed on June 14, 2015 09:44 44 rows in less than a minute
single run snapshot draft scrape failed on June 14, 2015 09:46 84 rows in less than a minute
single run snapshot draft scrape failed on June 14, 2015 09:48 121 rows in less than a minute
single run snapshot draft scrape failed on June 14, 2015 09:53 121 rows in less than a minute
single run snapshot draft scrape failed on June 14, 2015 09:54 121 rows in less than a minute
single run snapshot draft scrape failed on June 14, 2015 10:55 121 rows in about 1 hour
single run snapshot draft scrape failed on June 15, 2015 12:01 121 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 12:39 121 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 13:08 122 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 19:08 122 rows in 1 minute
single run snapshot draft scrape failed on June 15, 2015 19:24 121 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 19:26 0 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 19:28 108 rows in 1 minute
single run snapshot draft scrape failed on June 15, 2015 19:36 121 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 19:37 122 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 19:39 122 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 19:41 108 rows in 1 minute
single run snapshot draft scrape failed on June 15, 2015 19:43 122 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 19:46 122 rows in less than a minute
single run snapshot draft scrape failed on June 15, 2015 19:50 58 rows in less than a minute
single run snapshot draft scrape failed on June 16, 2015 18:39 122 rows in 1 minute
single run snapshot draft scrape succeeded on June 16, 2015 18:45 121 rows in 1 minute
single run snapshot draft scrape succeeded on June 16, 2015 18:47 121 rows in 1 minute
single run snapshot draft scrape succeeded on June 16, 2015 18:50 108 rows in 1 minute
single run snapshot draft scrape failed on June 16, 2015 19:05 2865 rows in 1 minute
single run snapshot draft scrape failed on June 16, 2015 19:08 86 rows in 1 minute
single run snapshot draft scrape succeeded on June 16, 2015 19:11 121 rows in 1 minute
single run snapshot draft scrape failed on June 16, 2015 19:13 152 rows in 1 minute
single run snapshot draft scrape succeeded on June 16, 2015 19:16 161 rows in 1 minute
single run snapshot draft scrape failed on June 16, 2015 19:20 0 rows in less than a minute
single run snapshot draft scrape succeeded on June 16, 2015 19:21 40 rows in 1 minute
single run snapshot draft scrape succeeded on June 16, 2015 19:25 161 rows in 2 minutes
single run snapshot draft scrape succeeded on June 28, 2015 17:56 163 rows in 1 minute
single run snapshot final draft scrape failed on July 02, 2015 13:16 161 rows in 3 minutes
single run snapshot draft scrape failed on July 02, 2015 22:01 161 rows in 1 minute
single run snapshot draft scrape succeeded on July 02, 2015 22:06 163 rows in 1 minute
single run snapshot draft scrape succeeded on July 02, 2015 22:08 163 rows in 1 minute
single run snapshot final draft scrape succeeded on July 03, 2015 13:52 164 rows in 1 minute
single run snapshot 1 prescrape scrape succeeded on July 03, 2015 13:53 164 rows in less than a minute
single run snapshot 2 scrape failed on August 03, 2015 13:53 0 rows in less than a minute
single run snapshot 2 scrape interrupted on August 06, 2015 11:57 129 rows in 1 day
single run snapshot 2 scrape succeeded on September 05, 2015 11:58 174 rows in 1 minute
single run snapshot 3 scrape failed on October 06, 2015 11:58 92 rows in 1 day
single run snapshot 3 scrape failed on October 10, 2015 09:36 141 rows in 1 day
single run snapshot 3 scrape errored on February 02, 2016 11:20 0 rows in 1 day
single run snapshot 3 scrape failed on February 05, 2016 10:29 132 rows in 1 day
single run snapshot draft scrape failed on March 19, 2016 17:55 13 rows in less than a minute

Config

{
  "bot_id": "kz-licences",
  "title": "Kazakh Financial Licences",
  "description": "This bot scrapes information about entities with seven types of financial licence in the Republic of Kazakhstan",
  "language": "python",
  "data_type": "primary data",
  "identifying_fields": [
    "name",
    "category"
  ],
  "files": [
    "scraper.py",
    "licence.py"
  ],
  "frequency": "monthly",
  "publisher": {
    "name": "National Bank of Kazakhstan",
    "url": "http://www.afn.kz/?switch=eng&docid=1",
    "terms": "Appears to be copyright, but no page setting out the detailed terms.",
    "terms_url": "http://www.afn.kz/?switch=eng&docid=1"
  },
  "transformers": [
    {
      "file": "licence.py",
      "data_type": "simple-licence",
      "identifying_fields": [
        "company_name",
        "jurisdiction_classification"
      ]
    }
  ]
}