API

Opendata

ls([prefix])          List open datasets on Faculty.
load(path[, copy])    Load an open dataset from Faculty.

faculty_extras.opendata.ls(prefix='')

List open datasets on Faculty.

Parameters:
prefix : str, optional

List only open datasets matching this prefix

Returns:
list
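
The effect of the prefix argument can be illustrated with a small self-contained sketch. It does not call faculty_extras: list_with_prefix and the example paths are hypothetical stand-ins, and both the starts-with matching and the sorted result are assumptions made for this example rather than documented behaviour of ls().

```python
def list_with_prefix(paths, prefix=""):
    """Return the paths that start with `prefix`, in sorted order.

    Hypothetical helper: it only imitates the idea of prefix filtering.
    """
    return sorted(p for p in paths if p.startswith(prefix))


# Hypothetical dataset paths, invented for this example.
example_paths = ["census/2011.csv", "weather/daily.csv", "census/2001.csv"]

# The default empty prefix matches every path.
all_paths = list_with_prefix(example_paths)

# A non-empty prefix narrows the listing.
census_paths = list_with_prefix(example_paths, prefix="census/")
```

With the empty default prefix every path is returned, which mirrors the prefix='' default in the signature above.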

faculty_extras.opendata.load(path, copy=True)

Load an open dataset from Faculty.

The dataset is downloaded only if the local copy is out of sync with Faculty. Once loaded, the data is cached in memory and reused by subsequent calls to load().

Parameters:
path : str

Path of file on Faculty

copy : bool, optional

Return a copy of the data (default: True)
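
Why a copy flag matters when data is cached in memory can be shown with a minimal sketch. Nothing here is the library's code: load_cached, fake_loader, the copy_result parameter and the example path are hypothetical, and the use of copy.deepcopy is an assumption about how a copy might be made.

```python
import copy

# In-memory cache shared by every call, keyed by path.
_memory_cache = {}


def load_cached(path, loader, copy_result=True):
    """Load `path` once with `loader`, then serve it from memory.

    Hypothetical helper: with copy_result=True the caller gets an
    independent copy, otherwise the cached object itself.
    """
    if path not in _memory_cache:
        _memory_cache[path] = loader(path)
    data = _memory_cache[path]
    return copy.deepcopy(data) if copy_result else data


def fake_loader(path):
    # Stand-in for reading a real file; returns a small mutable object.
    return {"rows": [1, 2, 3]}


# Mutating a copy leaves the cached data untouched.
safe = load_cached("example/data.csv", fake_loader)
safe["rows"].append(4)
after_safe = load_cached("example/data.csv", fake_loader)

# Mutating the shared object changes what later callers see.
shared = load_cached("example/data.csv", fake_loader, copy_result=False)
shared["rows"].append(99)
after_shared = load_cached("example/data.csv", fake_loader)
```

In this sketch, skipping the copy avoids duplicating the data but lets one caller's in-place edits leak into every later load.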

Dataset

faculty_extras.opendata.dataset.dataset_factory(s3_key)

Generate a dataset, inferring the correct class from the file extension.

Parameters:
s3_key : str

The path of the dataset inside Faculty opendata

Returns:
Dataset
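
Choosing a class from the file extension can be sketched as follows. The class names, the extension table and make_dataset are all hypothetical; the source does not say which extensions or subclasses dataset_factory supports, so this only illustrates the general pattern.

```python
import os


class BaseDataset:
    """Hypothetical fallback class used when the extension is unknown."""

    def __init__(self, s3_key):
        self.s3_key = s3_key


class CsvDataset(BaseDataset):
    """Hypothetical class for .csv files."""


class JsonDataset(BaseDataset):
    """Hypothetical class for .json files."""


# Assumed mapping from file extension to class, for illustration only.
_CLASS_BY_EXTENSION = {".csv": CsvDataset, ".json": JsonDataset}


def make_dataset(s3_key):
    """Pick a class from the key's extension and instantiate it."""
    _, extension = os.path.splitext(s3_key)
    dataset_class = _CLASS_BY_EXTENSION.get(extension.lower(), BaseDataset)
    return dataset_class(s3_key)


csv_dataset = make_dataset("example/table.csv")
json_dataset = make_dataset("example/records.JSON")
other_dataset = make_dataset("example/notes.txt")
```

Keeping the mapping in one dictionary means a new file type needs only a new entry, not a change to the factory function.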

class faculty_extras.opendata.dataset.Dataset(s3_key)

An open dataset stored on Faculty that is cached locally.

This class implements two levels of caching to reduce the time spent waiting for datasets to load. First, it stores data downloaded from Faculty on disk, avoiding repeat downloads. Second, it caches the loaded data in memory at runtime, so a dataset that is used many times only needs to be loaded once.

Parameters:
s3_key : str

The path of the dataset on S3

Attributes:
local_path

Get the path for local storage of the dataset.

Methods

load([copy]) Load the dataset.
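
The two caching layers described for Dataset (disk, then memory) can be sketched in a self-contained way. TwoLevelCache, fake_fetch and the example key are hypothetical, the download is simulated by a plain function, and the sketch omits the check that the local copy is in sync with the remote one.

```python
import os
import tempfile


class TwoLevelCache:
    """Hypothetical cache: memory first, then disk, then a remote fetch."""

    def __init__(self, cache_dir, fetch):
        self.cache_dir = cache_dir
        self.fetch = fetch  # callable taking a key and returning bytes
        self._memory = {}

    def local_path(self, key):
        # Flatten the key into a single file name inside the cache directory.
        return os.path.join(self.cache_dir, key.replace("/", "_"))

    def load(self, key):
        # Layer 2: data already loaded in this process.
        if key in self._memory:
            return self._memory[key]
        # Layer 1: data already stored on disk by an earlier run.
        path = self.local_path(key)
        if not os.path.exists(path):
            with open(path, "wb") as handle:
                handle.write(self.fetch(key))
        with open(path, "rb") as handle:
            data = handle.read()
        self._memory[key] = data
        return data


downloads = []  # records every simulated download


def fake_fetch(key):
    # Stand-in for a real download.
    downloads.append(key)
    return b"a,b\n1,2\n"


cache_dir = tempfile.mkdtemp()

first = TwoLevelCache(cache_dir, fake_fetch)
first.load("example/data.csv")  # fetched, written to disk, kept in memory
first.load("example/data.csv")  # served from memory

# A fresh object simulates a new session that shares the same disk cache.
second = TwoLevelCache(cache_dir, fake_fetch)
data = second.load("example/data.csv")  # served from disk, no new fetch
```

Three loads across two objects trigger a single simulated download, which is the saving the two layers are meant to provide.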