explanation_fetcher
ERROR :Failed to open archive '/tmp/tmpccdy6r4tapprov_explain.zip': File is not a zip file
Traceback (most recent call last):
File "/datapackage_pipelines_budgetkey/pipelines/budget/national/changes/explanations/explanation_fetcher.py", line 78, in <module>
spew(dp, itertools.chain(res_iter, [get_explanations(parameters['url'])]))
File "/usr/local/lib/python3.9/site-packages/datapackage_pipelines/wrapper/wrapper.py", line 75, in spew
for rec in res:
File "/datapackage_pipelines_budgetkey/pipelines/budget/national/changes/explanations/explanation_fetcher.py", line 51, in get_explanations
z_archive = zipfile.ZipFile(archive)
File "/usr/local/lib/python3.9/zipfile.py", line 1266, in __init__
self._RealGetContents()
File "/usr/local/lib/python3.9/zipfile.py", line 1333, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
Pipeline ID: budget/national/changes/explanations/all
flow
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/datapackage_pipelines/specs/../lib/flow.py", line 13, in <module>
flow = flow_module.flow(parameters, datapackage, resources, ctx.stats)
File "/datapackage_pipelines_budgetkey/pipelines/budget/national/changes/original/committee-zipfile.py", line 14, in flow
archive_ = gcl.download(WEIRD_ZIP_FILE, outfile=f'p{YEAR}.zip')
File "/datapackage_pipelines_budgetkey/common/google_chrome.py", line 162, in download
assert False, 'Failed to download file, %r' % downloads
AssertionError: Failed to download file, ['']
DEBUG :[chan 14] Max packet in: 32768 bytes
DEBUG :[chan 14] Max packet out: 32768 bytes
DEBUG :Secsh channel 14 opened.
DEBUG :[chan 14] Sesch channel 14 request ok
DEBUG :[chan 14] EOF received (14)
DEBUG :EOF in transport thread
DEBUG :[chan 14] EOF sent (14)
DEBUG :Dropping user packet because connection is dead.
Pipeline ID: budget/national/changes/original/committee-zipfile
Pipeline ID: budget/national/changes/original/current-year-fixes
Pipeline ID: budget/national/changes/original/national-budget-changes
Pipeline ID: budget/national/changes/processed/national-budget-changes-aggregated
dump.to_sql
ERROR :Traceback (most recent call last):
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
ERROR :self.dialect.do_execute(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
ERROR :cursor.execute(statement, parameters)
ERROR :psycopg2.errors.InternalError_: could not open relation with OID 508842333
ERROR :The above exception was the direct cause of the following exception:
ERROR :Traceback (most recent call last):
ERROR :File "/usr/local/lib/python3.9/site-packages/datapackage_pipelines/specs/../lib/dump/to_sql.py", line 15, in <module>
ERROR :spew_flow(flow(ctx.parameters), ctx)
ERROR :File "/usr/local/lib/python3.9/site-packages/datapackage_pipelines/wrapper/wrapper.py", line 181, in __exit__
ERROR :spew(self.datapackage, self.resource_iterator, stats=self.stats)
ERROR :File "/usr/local/lib/python3.9/site-packages/datapackage_pipelines/wrapper/wrapper.py", line 68, in spew
ERROR :for res in resources_iterator:
ERROR :File "/usr/local/lib/python3.9/site-packages/dataflows/base/datastream_processor.py", line 68, in <genexpr>
ERROR :res_iter = (it if isinstance(it, ResourceWrapper) else ResourceWrapper(res, it)
ERROR :File "/usr/local/lib/python3.9/site-packages/dataflows/processors/dumpers/dumper_base.py", line 82, in process_resources
ERROR :ret = self.process_resource(
ERROR :File "/usr/local/lib/python3.9/site-packages/dataflows/processors/dumpers/to_sql.py", line 108, in process_resource
ERROR :storage.delete('')
ERROR :File "/usr/local/lib/python3.9/site-packages/tableschema_sql/storage.py", line 183, in delete
ERROR :self.__reflect()
ERROR :File "/usr/local/lib/python3.9/site-packages/tableschema_sql/storage.py", line 278, in __reflect
ERROR :self.__metadata.reflect(only=only, bind=self.__engine)
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 5752, in reflect
ERROR :_reflect_info = insp._get_reflection_info(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 2018, in _get_reflection_info
ERROR :check_constraints=run(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 1994, in run
ERROR :res = meth(filter_names=_fn, **kw)
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 1457, in get_multi_check_constraints
ERROR :self.dialect.get_multi_check_constraints(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/postgresql/base.py", line 4677, in get_multi_check_constraints
ERROR :result = connection.execute(query, params)
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1412, in execute
ERROR :return meth(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 483, in _execute_on_connection
ERROR :return connection._execute_clauseelement(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1635, in _execute_clauseelement
ERROR :ret = self._execute_context(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1844, in _execute_context
ERROR :return self._exec_single_context(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1984, in _exec_single_context
ERROR :self._handle_dbapi_exception(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2339, in _handle_dbapi_exception
ERROR :raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
ERROR :self.dialect.do_execute(
ERROR :File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
ERROR :cursor.execute(statement, parameters)
ERROR :sqlalchemy.exc.InternalError: (psycopg2.errors.InternalError_) could not open relation with OID 508842333
[SQL: SELECT pg_catalog.pg_class.relname, pg_catalog.pg_constraint.conname, CASE WHEN (pg_catalog.pg_constraint.oid IS NOT NULL) THEN pg_catalog.pg_get_constraintdef(pg_catalog.pg_constraint.oid, %(pg_get_constraintdef_1)s) END AS anon_1, pg_catalog.pg_description.description
FROM pg_catalog.pg_class LEFT OUTER JOIN pg_catalog.pg_constraint ON pg_catalog.pg_class.oid = pg_catalog.pg_constraint.conrelid AND pg_catalog.pg_constraint.contype = %(contype_1)s LEFT OUTER JOIN pg_catalog.pg_description ON pg_catalog.pg_description.objoid = pg_catalog.pg_constraint.oid JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace
WHERE pg_catalog.pg_class.relkind = ANY (ARRAY[%(param_1)s, %(param_2)s, %(param_3)s]) AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != %(nspname_1)s ORDER BY pg_catalog.pg_class.relname, pg_catalog.pg_constraint.conname]
[parameters: {'pg_get_constraintdef_1': True, 'contype_1': 'c', 'param_1': 'r', 'param_2': 'p', 'param_3': 'f', 'nspname_1': 'pg_catalog'}]
(Background on this error at: https://sqlalche.me/e/20/2j85)
Pipeline ID: budget/national/changes/processed/transactions
We fetch the data that the Ministry of Finance publishes on data.gov.il every year. It comes in XLS format, with one row per TAKANA and phase (original, approved, executed). This pipeline merges each such triplet into a single row that holds all the data.
Pipeline ID: budget/national/original/national-budgets
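The triplet merge described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the column names (`year`, `code`, `phase`, `amount`) and the helper `merge_phases` are assumptions, since the real XLS headers are not shown here.

```python
from collections import defaultdict

# Phase names as listed in the description; column names below are hypothetical.
PHASES = ("original", "approved", "executed")


def merge_phases(rows):
    """Collapse one-row-per-(item, phase) input into one row per item."""
    merged = defaultdict(dict)
    for row in rows:
        key = (row["year"], row["code"])
        rec = merged[key]
        rec["year"] = row["year"]
        rec["code"] = row["code"]
        # Store each phase's amount in its own column.
        rec[f"amount_{row['phase']}"] = row["amount"]
    # Make sure every output row has all three phase columns,
    # using None where a phase was missing from the input.
    return [
        {**rec, **{f"amount_{p}": rec.get(f"amount_{p}") for p in PHASES}}
        for rec in merged.values()
    ]


rows = [
    {"year": 2020, "code": "20670142", "phase": "original", "amount": 100},
    {"year": 2020, "code": "20670142", "phase": "approved", "amount": 120},
    {"year": 2020, "code": "20670142", "phase": "executed", "amount": 110},
    {"year": 2020, "code": "20670155", "phase": "original", "amount": 50},
]
result = merge_phases(rows)
```

Items that lack a phase in the input keep a `None` in that column rather than being dropped.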
This pipeline joins the different phases of the budget (allocated, revised and executed). In the original file there is a separate row for each phase; we prefer a single row with all the phase info. The pipeline also renames the column titles to friendlier English names, and it creates rows for all hierarchy levels, where the upper levels (2, 4 and 6 digits) are plain aggregations of the 8-digit items they contain.
Pipeline ID: budget/national/processed/aggregated-yearly
Pipeline ID: budget/national/processed/category-explanations
This pipeline joins budget items that span across years.
Pipeline ID: budget/national/processed/connected-items-explained
This pipeline joins budget items that span across years.
Pipeline ID: budget/national/processed/connected-national-budgets
Pipeline ID: budget/national/processed/just-the-total
Pipeline ID: budget/national/processed/roof-names
This pipeline joins the budget data to itself so that each item has a list of its immediate children.
Pipeline ID: budget/national/processed/with-extras
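The self-join described above (each item receiving a list of its immediate children) could look something like this. It assumes, hypothetically, that a child's code is its parent's code plus two more digits, and that rows carry a `code` field; neither detail is confirmed by the text above.

```python
def attach_children(rows):
    """Return copies of rows, each with a 'children' list of child codes.

    Assumes an item's immediate parent is the item whose code is the
    same string minus its last two digits.
    """
    by_code = {row["code"]: {**row, "children": []} for row in rows}
    for code in by_code:
        parent = by_code.get(code[:-2])
        if parent is not None:
            parent["children"].append(code)
    return list(by_code.values())


# Hypothetical sample rows, for illustration only.
rows = [
    {"code": "20"},
    {"code": "2067"},
    {"code": "2068"},
    {"code": "206701"},
]
tree = {row["code"]: row["children"] for row in attach_children(rows)}
```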