Tools and Integrations
This page lists user-friendly tools developed as part of the Training Provider Outcomes Toolkit for securely collecting, connecting, analyzing, aggregating, and publishing data on wage and employment outcomes for education and training participants. It also lists integrations for working with DataatWork outputs in popular existing tools. Because we build on the Open Knowledge Foundation’s Data Package standard, many of the tools below are general Data Package tools that institutions can use beyond the Training Data Package use case. If you are a developer, you can find software libraries to load and manage Skills and Training Data Packages in your language of choice.
- Quick Start Tools for Training Provider Data Packages
- Use Data Packages with …
- Catalog of Tools and Integrations
Quick Start Tools for Training Provider Data Packages
Online Data Package viewer app – provides a nice human-friendly view of a Data Package in seconds.
Data Package your data by creating a datapackage.json – the online datapackage.json maker creates the datapackage.json file needed to turn data into a Data Package.
Online validator – checks that your datapackage.json and Training Data Package are good to go.
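For readers who prefer to see what such a file contains, here is a minimal sketch in plain Python of building a datapackage.json descriptor and running a few basic checks on it. It is not the online maker or validator: the function names, the package name, the CSV file name, and the columns are all hypothetical, and the checks cover only a small part of what a full validator would test.

```python
import json
from pathlib import Path


def build_descriptor(name, csv_path, fields):
    """Return a minimal Tabular Data Package descriptor as a dict.

    `fields` is a list of (column_name, type) pairs, e.g. ("wage", "number").
    """
    return {
        "name": name,
        "resources": [
            {
                "name": Path(csv_path).stem,
                "path": csv_path,
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }


def check_descriptor(descriptor):
    """Return a list of problems; an empty list means these basic checks passed.

    Assumption of this sketch: a package name is treated as required, and each
    resource must point at its data through `path`, `url`, or `data`.
    """
    problems = []
    if not descriptor.get("name"):
        problems.append("missing package name")
    resources = descriptor.get("resources")
    if not isinstance(resources, list) or not resources:
        problems.append("resources must be a non-empty list")
        return problems
    for i, res in enumerate(resources):
        if not any(key in res for key in ("path", "url", "data")):
            problems.append(f"resource {i} has no path, url, or data")
    return problems


descriptor = build_descriptor(
    "example-training-outcomes",  # hypothetical package name
    "outcomes.csv",               # hypothetical CSV file
    [("program_id", "string"), ("median_wage", "number")],
)
# The text printed here is what would be saved as datapackage.json.
print(json.dumps(descriptor, indent=2))
print(check_descriptor(descriptor))
```

Saving the printed JSON next to the CSV file as datapackage.json gives a package that the tools above can then inspect more thoroughly.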
- Creating and using Training Data Packages in Python (coming soon)
- Creating and using Skills Data Packages in Python (coming soon)
- Creating and using Training Data Packages in R (coming soon)
- Data Package Manager (dpm) - overall library and command line tool
- datapackage-init - create Data Packages
- datapackage-read - load and access Data Packages
- datapackage-validate - validate Data Packages
- datapackage-render - render Data Packages and their views to embeddable HTML, images (png) and more
A comprehensive Python library is available:
Two Ruby libraries are available:
- https://github.com/textkit/datapak - work with tabular data packages: download, load, or query datasets using SQL via ActiveRecord. It works with any SQL database and defaults to an in-memory SQLite database.
- https://github.com/theodi/datapackage.rb – parse and validate both data packages and tabular data packages. (May be obsolete: not updated since February 2014.)
A validator and storage library for working with JSON Table Schema is available here:
https://github.com/the42/datapackage - provides struct specifications for Data Package as well as a command line tool to create Data Packages.
- R Data Package Library - by rOpenSci
- R Data Package Manager - by Christopher Gandrud
- R Open Data Protocols Library - by QRBC
A function to read data from a Tabular Data Package is available for download from MATLAB Central’s File Exchange.
To contribute to the library, see the project’s GitHub repository.
Data Package Manager (dpm) – https://github.com/okfn/dpm. Comprehensive command line tool.
Use Data Packages with …
These “Using with” examples usually require Tabular Data Packages, in which the data in the Data Package is stored as CSV.
- https://github.com/frictionlessdata/jsontableschema-sql-py - generic JSON Table Schema to SQL library in Python
- https://github.com/frictionlessdata/datapackage-py - general Python library that can be used to automate the import of Tabular Data Packages into SQL
- You can also use the Ruby datapak library (see Ruby library section)
In addition to the generic options, there is a simple Python script (no dependencies) to load a Tabular Data Package into SQLite.
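To illustrate the idea, here is a minimal sketch of such a loader using only the Python standard library. It is not the script referred to above: the function name `load_resource` and the type mapping are assumptions made for this example, and it expects each resource to have a `path` to a local CSV file with a header row and a `schema` listing its fields.

```python
import csv
import json
import sqlite3
from pathlib import Path

# Assumed mapping from JSON Table Schema types to SQLite column types;
# anything not listed here is stored as TEXT.
SQLITE_TYPES = {"integer": "INTEGER", "number": "REAL", "boolean": "INTEGER"}


def quote(identifier):
    """Quote a table or column name so spaces and keywords are safe in SQL."""
    return '"' + identifier.replace('"', '""') + '"'


def load_resource(package_dir, conn, resource_index=0):
    """Load one CSV resource of a Tabular Data Package into SQLite.

    Returns the name of the table that was created.
    """
    package_dir = Path(package_dir)
    descriptor = json.loads(
        (package_dir / "datapackage.json").read_text(encoding="utf-8")
    )
    resource = descriptor["resources"][resource_index]
    fields = resource["schema"]["fields"]
    table = resource.get("name") or Path(resource["path"]).stem

    columns = ", ".join(
        f"{quote(f['name'])} {SQLITE_TYPES.get(f.get('type'), 'TEXT')}"
        for f in fields
    )
    conn.execute(f"CREATE TABLE {quote(table)} ({columns})")

    placeholders = ", ".join("?" for _ in fields)
    csv_file = package_dir / resource["path"]
    with open(csv_file, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row; column order comes from the schema
        rows = (row for row in reader if row)  # ignore blank lines
        conn.executemany(
            f"INSERT INTO {quote(table)} VALUES ({placeholders})", rows
        )
    conn.commit()
    return table
```

A call such as `load_resource("my-package", sqlite3.connect("outcomes.db"))`, where both names are hypothetical, would create one table for the first resource. Values are inserted as text and SQLite's column type affinity converts numeric-looking values in INTEGER and REAL columns.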
In addition to the generic options, there is a Python script (with no dependencies) to load a Tabular Data Package into PostgreSQL.
There is a BIML project that uses datapackage.json to generate SSIS packages that can load the contents of a Tabular Data Package into a SQL Server database. Find out more about SQL Server Integration Services (SSIS).
In progress: fully automated Data Package support (see this issue for updates).
In the meantime you can just open the CSV file by hand!
In progress: Fully automated Data Package support (see this issue for updates).
In the meantime you can just import the CSV files in the Data Package directly.
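When importing by hand, it helps to know which files in the package are the CSV resources. The following is a small sketch, assuming a local package directory that contains a datapackage.json; the function name `csv_resource_paths` is hypothetical.

```python
import json
from pathlib import Path


def csv_resource_paths(package_dir):
    """Return the paths of all local CSV resources listed in datapackage.json."""
    package_dir = Path(package_dir)
    descriptor = json.loads(
        (package_dir / "datapackage.json").read_text(encoding="utf-8")
    )
    paths = []
    for resource in descriptor.get("resources", []):
        path = resource.get("path")
        # Skip inline data and non-CSV formats; only CSV files can be
        # imported directly in the way described above.
        if isinstance(path, str) and path.lower().endswith(".csv"):
            paths.append(package_dir / path)
    return paths
```

Each returned path can then be opened or imported with whatever tool you are using.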
- https://github.com/frictionlessdata/jsontableschema-bigquery-py - generic JSON Table Schema to BigQuery library in Python
- https://github.com/frictionlessdata/datapackage-py - general Python library that can be used to automate the import of Tabular Data Packages into BigQuery