API

cp(source_path, destination_path[, …])

Copy a file or directory within a project’s datasets.

etag(project_path[, project_id, object_client])

Get a unique identifier for the current version of a file.

get(project_path, local_path[, project_id, …])

Copy from a project’s datasets to the local filesystem.

glob(pattern[, prefix, project_id, …])

List contents of project datasets that match a glob pattern.

ls([prefix, project_id, show_hidden, …])

List contents of project datasets.

mv(source_path, destination_path[, …])

Move a file or directory within a project’s datasets.

open(project_path[, mode, temp_dir, project_id])

Open a file from a project’s datasets for reading.

put(local_path, project_path[, project_id, …])

Copy from the local filesystem to a project’s datasets.

rm(project_path[, project_id, recursive, …])

Remove a file or directory from the project datasets.

rmdir(project_path[, project_id, object_client])

Remove an empty directory from the project datasets.

Query, read and write Faculty datasets.

faculty.datasets.cp(source_path, destination_path, project_id=None, recursive=False, object_client=None)

Copy a file or directory within a project’s datasets.

Parameters
source_path : str

The source path in the project datasets to copy.

destination_path : str

The destination path in the project datasets.

project_id : str, optional

The project to get files from. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.

recursive : bool, optional

If True, allows copying directories like a recursive copy in a filesystem. By default the action is not recursive.

object_client : faculty.clients.object.ObjectClient, optional

Advanced - can be used to benefit from caching in chain interactions with datasets.
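As a hedged illustration of the recursive flag, the sketch below snapshots a dataset directory to a sibling path. The helper name backup_directory and the ".bak" suffix are hypothetical and not part of the library. The datasets argument is expected to be the faculty.datasets module; it is passed in so the helper can also be tried against a stand-in object.

```python
def backup_directory(datasets, source_dir, suffix=".bak", project_id=None):
    """Copy a dataset directory to a sibling path and return the new path.

    `datasets` is expected to be the `faculty.datasets` module.
    `backup_directory` and `suffix` are hypothetical names for illustration.
    """
    destination = source_dir.rstrip("/") + suffix
    # recursive=True is required because the source is a directory.
    datasets.cp(source_dir, destination, project_id=project_id, recursive=True)
    return destination
```

With the real module this might be called as backup_directory(faculty.datasets, "/raw/").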

faculty.datasets.etag(project_path, project_id=None, object_client=None)

Get a unique identifier for the current version of a file.

Parameters
project_path : str

The path in the project datasets.

project_id : str, optional

The project to get files from. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.

object_client : faculty.clients.object.ObjectClient, optional

Advanced - can be used to benefit from caching in chain interactions with datasets.

Returns
str
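Because the etag identifies the current version of a file, one possible use is to skip a download when nothing has changed. The following is a minimal sketch under that assumption. The helper name download_if_changed and the known_etag argument are hypothetical, and datasets is expected to be the faculty.datasets module.

```python
def download_if_changed(datasets, project_path, local_path,
                        known_etag=None, project_id=None):
    """Download `project_path` only if its etag differs from `known_etag`.

    Returns the current etag so the caller can store it for the next check.
    `datasets` is expected to be the `faculty.datasets` module.
    """
    current = datasets.etag(project_path, project_id=project_id)
    if current != known_etag:
        # The stored version is stale (or missing), so fetch a fresh copy.
        datasets.get(project_path, local_path, project_id=project_id)
    return current
```

The caller is responsible for persisting the returned etag between runs, for example in a small sidecar file.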

faculty.datasets.get(project_path, local_path, project_id=None, object_client=None)

Copy from a project’s datasets to the local filesystem.

Parameters
project_path : str

The source path in the project datasets to copy.

local_path : str or os.PathLike

The destination path in the local filesystem.

project_id : str, optional

The project to get files from. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.

object_client : faculty.clients.object.ObjectClient, optional

Advanced - can be used to benefit from caching in chain interactions with datasets.
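Since local_path accepts an os.PathLike, a pathlib.Path can be passed directly. The sketch below downloads one dataset file into a fresh temporary directory. The helper name fetch_to_temp is hypothetical, and datasets is expected to be the faculty.datasets module.

```python
import posixpath
import tempfile
from pathlib import Path


def fetch_to_temp(datasets, project_path, project_id=None):
    """Download a dataset file into a new temporary directory.

    Returns the local `Path` of the downloaded file. The caller is
    responsible for deleting the directory when finished.
    `datasets` is expected to be the `faculty.datasets` module.
    """
    target_dir = Path(tempfile.mkdtemp())
    # Dataset paths use forward slashes, so take the file name with posixpath.
    local_path = target_dir / posixpath.basename(project_path)
    datasets.get(project_path, local_path, project_id=project_id)
    return local_path
```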

faculty.datasets.glob(pattern, prefix='/', project_id=None, show_hidden=False, object_client=None)

List contents of project datasets that match a glob pattern.

Parameters
pattern : str

The pattern that contents need to match.

prefix : str, optional

List only files in the project datasets that have this prefix. Default behaviour is to list all files.

project_id : str, optional

The project to list files from. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.

show_hidden : bool, optional

Include hidden files in the output. Defaults to False.

object_client : faculty.clients.object.ObjectClient, optional

Advanced - can be used to benefit from caching in chain interactions with datasets.

Returns
list

The list of files from the project that match the glob pattern.
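The returned list can be post-processed like any other list of paths. As a sketch, the helper below groups matches by file extension. The name group_by_extension is hypothetical, the entries are assumed to be path strings, and datasets is expected to be the faculty.datasets module.

```python
import posixpath
from collections import defaultdict


def group_by_extension(datasets, pattern, prefix="/", project_id=None):
    """Group dataset paths matching `pattern` by their file extension.

    Assumes each entry returned by `glob` is a path string.
    `datasets` is expected to be the `faculty.datasets` module.
    """
    grouped = defaultdict(list)
    for path in datasets.glob(pattern, prefix=prefix, project_id=project_id):
        extension = posixpath.splitext(path)[1]
        grouped[extension].append(path)
    return dict(grouped)
```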

faculty.datasets.ls(prefix='/', project_id=None, show_hidden=False, object_client=None)

List contents of project datasets.

Parameters
prefix : str, optional

List only files in the datasets matching this prefix. Default behaviour is to list all files.

project_id : str, optional

The project to list files from. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.

show_hidden : bool, optional

Include hidden files in the output. Defaults to False.

object_client : faculty.clients.object.ObjectClient, optional

Advanced - can be used to benefit from caching in chain interactions with datasets.

Returns
list

The list of files from the project datasets.
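One way to check whether a path exists is to list with that path as the prefix and look for an exact match. This is only a sketch: it assumes the listing contains full path strings in the same form as the argument, which this reference does not state. The helper name dataset_exists is hypothetical, and datasets is expected to be the faculty.datasets module.

```python
def dataset_exists(datasets, project_path, project_id=None):
    """Return True if `project_path` appears in the project datasets listing.

    Assumption: `ls` returns full path strings in the same form as
    `project_path`. `datasets` is expected to be `faculty.datasets`.
    """
    listing = datasets.ls(
        prefix=project_path, project_id=project_id, show_hidden=True
    )
    # A prefix match alone is not enough: "/data/a" is a prefix of "/data/ab.csv".
    return project_path in listing
```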

faculty.datasets.mv(source_path, destination_path, project_id=None, object_client=None)

Move a file or directory within a project’s datasets.

Parameters
source_path : str

The source path in the project datasets to move.

destination_path : str

The destination path in the project datasets.

project_id : str, optional

The project to get files from. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.

object_client : faculty.clients.object.ObjectClient, optional

Advanced - can be used to benefit from caching in chain interactions with datasets.
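A possible use of mv is to move processed files into an archive location. The sketch below builds the destination from the source file name. The helper name archive_file and the "/archive" default are hypothetical, and datasets is expected to be the faculty.datasets module.

```python
import posixpath


def archive_file(datasets, project_path, archive_dir="/archive", project_id=None):
    """Move a dataset file into `archive_dir`, keeping its file name.

    Returns the new path. `archive_dir` is a hypothetical location.
    `datasets` is expected to be the `faculty.datasets` module.
    """
    destination = posixpath.join(archive_dir, posixpath.basename(project_path))
    datasets.mv(project_path, destination, project_id=project_id)
    return destination
```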

faculty.datasets.open(project_path, mode='r', temp_dir=None, project_id=None, **kwargs)

Open a file from a project’s datasets for reading.

This downloads the file into a temporary directory before opening it, so if your files are very large, this function can take a long time.

Parameters
project_path : str

The path of the file in the project’s datasets to open.

mode : str, optional

The opening mode, either 'r' or 'rb'. This is passed down to the standard Python open function. Writing is currently not supported.

temp_dir : str, optional

A directory on the local filesystem where you would like the file to be saved temporarily. Note that on Faculty servers, the default temporary directory can break with large files, so if your file is upwards of 2 GB, it is recommended to specify temp_dir='/project'.

project_id : str, optional

The project to get files from. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.
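The sketch below reads a CSV file from the project datasets in text mode. It assumes the object returned by open can be used in a with statement and behaves like a regular text file, which this reference does not spell out. The helper name read_csv_rows is hypothetical, and datasets is expected to be the faculty.datasets module.

```python
import csv


def read_csv_rows(datasets, project_path, temp_dir=None, project_id=None):
    """Read a CSV file from the project datasets into a list of dicts.

    Assumption: `datasets.open` works in a `with` statement and yields a
    text file object. `datasets` is expected to be `faculty.datasets`.
    """
    with datasets.open(
        project_path, mode="r", temp_dir=temp_dir, project_id=project_id
    ) as f:
        # Materialise the rows before the file is closed.
        return list(csv.DictReader(f))
```

For very large files, passing temp_dir as described above may be worth considering, since the file is downloaded in full before it is opened.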

faculty.datasets.put(local_path, project_path, project_id=None, object_client=None)

Copy from the local filesystem to a project’s datasets.

Parameters
local_path : str or os.PathLike

The source path in the local filesystem to copy.

project_path : str

The destination path in the project datasets.

project_id : str, optional

The project to put files in. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.

object_client : faculty.clients.object.ObjectClient, optional

Advanced - can be used to benefit from caching in chain interactions with datasets.
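Because put copies from the local filesystem, in-memory content has to be written to a local file first. A minimal sketch of that pattern follows. The helper name put_text and the temporary file name are hypothetical, and datasets is expected to be the faculty.datasets module.

```python
import os
import tempfile


def put_text(datasets, text, project_path, project_id=None):
    """Write `text` to a temporary local file and upload it to `project_path`.

    `datasets` is expected to be the `faculty.datasets` module.
    """
    with tempfile.TemporaryDirectory() as tmp:
        local_path = os.path.join(tmp, "payload.txt")  # hypothetical file name
        with open(local_path, "w", encoding="utf-8") as f:
            f.write(text)
        # Upload while the temporary directory still exists.
        datasets.put(local_path, project_path, project_id=project_id)
```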

faculty.datasets.rm(project_path, project_id=None, recursive=False, object_client=None)

Remove a file or directory from the project datasets.

Parameters
project_path : str

The path in the project datasets to remove.

project_id : str, optional

The project to get files from. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.

recursive : bool, optional

If True, allows deleting directories like a recursive delete in a filesystem. By default the action is not recursive.

object_client : faculty.clients.object.ObjectClient, optional

Advanced - can be used to benefit from caching in chain interactions with datasets.
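Recursive deletion is easy to misuse, so a thin wrapper with a guard can help. The sketch below refuses to remove the dataset root and only passes recursive=True when the caller says the path is a directory. The helper name remove_path and the is_directory argument are hypothetical, and datasets is expected to be the faculty.datasets module.

```python
def remove_path(datasets, project_path, is_directory=False, project_id=None):
    """Remove a file, or a directory when `is_directory` is True.

    Raises ValueError rather than deleting the dataset root.
    `datasets` is expected to be the `faculty.datasets` module.
    """
    if project_path.strip() in ("", "/"):
        raise ValueError("refusing to remove the dataset root")
    # recursive is only enabled when the caller explicitly asks for it.
    datasets.rm(project_path, project_id=project_id, recursive=is_directory)
```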

faculty.datasets.rmdir(project_path, project_id=None, object_client=None)

Remove an empty directory from the project datasets.

Parameters
project_path : str

The path of the directory to remove.

project_id : str, optional

The project to get files from. You need to have access to this project for it to work. Defaults to the project set by FACULTY_PROJECT_ID in your environment.

object_client : faculty.clients.object.ObjectClient, optional

Advanced - can be used to benefit from caching in chain interactions with datasets.