When a bot is pushed to Turbot, a first-draft run is performed, validated, and sent for review. This initial draft run is capped at 2,000 records: if any issues need attention before the data is loaded into OpenCorporates, the cap minimises the load placed on the data source's servers while they are resolved.
At this stage we are mostly checking the data produced by the bot against the data source, to make sure we have represented it as accurately as possible. We also check the format of the Transformer, as this determines what data will be visible in OpenCorporates.
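For context, a transformer is typically a small script that reads the bot's JSON records line by line on stdin and writes reshaped records to stdout. The sketch below is only illustrative: the input and output field names (`company_name`, `company_number`, and the nested `company` structure) are assumptions, and the real target schema is the one defined by OpenCorporates.

```python
import json
import sys

def transform(line):
    """Reshape one raw bot record into the structure the importer expects.
    The field names used here are illustrative assumptions, not the
    actual OpenCorporates schema."""
    raw = json.loads(line)
    return {
        "company": {
            "name": raw.get("company_name"),
            "company_number": raw.get("company_number"),
        },
        "source_url": raw.get("source_url"),
    }

if __name__ == "__main__":
    # Read the bot's output one JSON record per line on stdin and
    # write one transformed JSON record per line to stdout.
    for line in sys.stdin:
        print(json.dumps(transform(line)))
```

Because the transformer's output is what ends up in OpenCorporates, getting this mapping right is the main focus of the first review.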
Once any potential issues have been discussed and resolved, the bot is sent on for a complete run and a final review.
At this stage we have a complete dataset to work with. This could expose further text-processing issues, but it is mainly used to assess the completeness of the dataset and to expose any gaps in date ranges, alpha-search sequences, etc. The point of having two review stages is to perform as few full runs as possible and so minimise our footprint on remote servers.
Again, any issues that surface at this stage of review will be communicated back, and we will work together to resolve them. Once all issues have been resolved, or if none exist, the data can be added to OpenCorporates.
After a bot is accepted, it will first add its data to OpenCorporates and will then be scheduled to run regularly based on the "frequency" field in the manifest; see the Turbot Specification for more information about the manifest.
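As a rough illustration, a manifest carrying a scheduling frequency might look something like the fragment below. Apart from "frequency", which the text above mentions, every field name and value here is an assumption for illustration; the authoritative list of fields is in the Turbot Specification.

```json
{
  "bot_id": "example_register",
  "title": "Example Company Register",
  "language": "python",
  "files": ["scraper.py"],
  "frequency": "monthly"
}
```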
If a bot falls over for any reason at any time, this will be visible on the "My Bots" page of Turbot. Once the cause of the failure has been fixed, the bot can be re-pushed: it will be reviewed again, its data will be ingested, and the bot will be re-scheduled.