API

Opendata

ls([prefix])

List open datasets on Faculty.

load(path[, copy])

Load an open dataset from Faculty.

faculty_extras.opendata.ls(prefix='')

List open datasets on Faculty.

Parameters
prefix : str, optional

List only open datasets matching this prefix

Returns
list
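
The prefix filter can be pictured as a plain string-prefix match over dataset paths. The sketch below is a stand-in, not the library's implementation: `ls_sketch` and the example paths are hypothetical, and it assumes that the returned list holds path strings and that matching means `str.startswith`, neither of which is stated above.

```python
def ls_sketch(paths, prefix=''):
    """Return the paths that start with `prefix` (hypothetical stand-in for ls)."""
    # An empty prefix matches everything, mirroring the default prefix=''.
    return [p for p in paths if p.startswith(prefix)]


# Made-up dataset paths, for illustration only.
example_paths = [
    'weather/2019.csv',
    'weather/2020.csv',
    'transport/stations.json',
]

everything = ls_sketch(example_paths)
weather_only = ls_sketch(example_paths, prefix='weather/')
```

With the real function, the equivalent call would be `faculty_extras.opendata.ls(prefix='weather/')`, where `'weather/'` is again a made-up prefix.
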
faculty_extras.opendata.load(path, copy=True)

Load an open dataset from Faculty.

The dataset is downloaded only if the local copy is out of sync with Faculty. The loaded data is cached in memory for subsequent calls to load().

Parameters
path : str

Path of file on Faculty

copy : bool, optional

Return a copy of the data (default: True)
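
The `copy` flag matters because the data is cached in memory. The following is a minimal sketch of that interaction, using a hypothetical `load_sketch` with a dict-based cache and a fake reader. None of these names come from faculty_extras, and the real cache may work differently.

```python
import copy as copy_module

_cache = {}  # hypothetical in-memory cache, keyed by path


def _read_from_disk(path):
    # Stand-in for the real download-and-parse step.
    return {'path': path, 'rows': [1, 2, 3]}


def load_sketch(path, copy=True):
    """Load `path` once, then serve later calls from the cache."""
    if path not in _cache:
        _cache[path] = _read_from_disk(path)
    data = _cache[path]
    # With copy=True the caller gets an independent object, so edits
    # cannot leak into the cached data seen by later callers.
    return copy_module.deepcopy(data) if copy else data


safe = load_sketch('example/data.csv')
safe['rows'].append(99)  # only the copy changes
shared = load_sketch('example/data.csv', copy=False)
```

Under these assumptions, `copy=False` avoids the cost of copying but leaves the caller responsible for not mutating the shared object.
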

Dataset

faculty_extras.opendata.dataset.dataset_factory(s3_key)

Generate a dataset, inferring the correct class from the file extension.

Parameters
s3_key : str

The path of the dataset inside Faculty opendata

Returns
Dataset
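
Inferring a class from the file extension could look roughly like the sketch below. The class names, the extension mapping, and the fallback to a base class are all hypothetical; the source does not say which extensions or subclasses the real `dataset_factory` supports.

```python
import os


class DatasetSketch:
    """Hypothetical base class; only stores the key."""

    def __init__(self, s3_key):
        self.s3_key = s3_key


class CsvDatasetSketch(DatasetSketch):
    pass


class JsonDatasetSketch(DatasetSketch):
    pass


# Hypothetical mapping from file extension to dataset class.
_CLASS_BY_EXTENSION = {
    '.csv': CsvDatasetSketch,
    '.json': JsonDatasetSketch,
}


def dataset_factory_sketch(s3_key):
    """Pick a class from the extension of `s3_key` and instantiate it."""
    _, extension = os.path.splitext(s3_key)
    # Unknown extensions fall back to the base class in this sketch.
    cls = _CLASS_BY_EXTENSION.get(extension.lower(), DatasetSketch)
    return cls(s3_key)


csv_dataset = dataset_factory_sketch('example/table.csv')
other_dataset = dataset_factory_sketch('example/readme.txt')
```
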
class faculty_extras.opendata.dataset.Dataset(s3_key)

An open dataset stored on Faculty that is cached locally.

This class implements two levels of caching to reduce time spent waiting for datasets to load. First, it stores data downloaded from Faculty on disk, avoiding repeat downloads. Second, it caches loaded data in memory at runtime, so a dataset that is used many times only needs to be loaded once.

Parameters
s3_key : str

The path of the dataset on S3

Attributes
local_path

Get the path for local storage of the dataset.

Methods

load([copy])

Load the dataset.
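
The two levels of caching described for this class can be illustrated with a small stand-in. Everything here is hypothetical: `CachedDatasetSketch`, its `cache_dir` argument, the file-naming scheme, and the fake download are illustrative only. The sketch also checks only whether the local file exists, whereas the real library is described as checking whether it is in sync with Faculty.

```python
import os
import tempfile


class CachedDatasetSketch:
    """Hypothetical two-level cache: memory first, then disk."""

    downloads = 0  # counts simulated downloads, for demonstration

    def __init__(self, s3_key, cache_dir):
        self.s3_key = s3_key
        self.cache_dir = cache_dir
        self._data = None  # in-memory cache

    @property
    def local_path(self):
        # Flatten the key into a single file name inside the cache dir.
        return os.path.join(self.cache_dir, self.s3_key.replace('/', '_'))

    def _download(self):
        # Stand-in for fetching the file from remote storage.
        CachedDatasetSketch.downloads += 1
        with open(self.local_path, 'w') as f:
            f.write('a,b\n1,2\n')

    def load(self):
        if self._data is None:  # memory miss
            if not os.path.exists(self.local_path):  # disk miss
                self._download()
            with open(self.local_path) as f:
                self._data = f.read()
        return self._data


cache_dir = tempfile.mkdtemp()
first = CachedDatasetSketch('example/table.csv', cache_dir)
first.load()
first.load()  # served from memory
second = CachedDatasetSketch('example/table.csv', cache_dir)
second.load()  # served from disk, no new download
```

In this sketch the simulated download runs once, even though `load()` is called three times across two instances.
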