ButlerURI
class lsst.daf.butler.ButlerURI
Bases: object
Convenience wrapper around URI parsers.
Provides access to URI components and can convert file paths into absolute path URIs. Scheme-less URIs are treated as if they are local file system paths and are converted to absolute URIs.
A specialist subclass is created for each supported URI scheme.
Parameters:
- uri : str or urllib.parse.ParseResult
  URI in string form. Can be scheme-less if referring to a local filesystem path.
- root : str or ButlerURI, optional
  When fixing up a relative path in a file scheme or if scheme-less, use this as the root. Must be absolute. If None the current working directory will be used. Can be a file URI.
- forceAbsolute : bool, optional
  If True, scheme-less relative URI will be converted to an absolute path using a file scheme. If False scheme-less URI will remain scheme-less and will not be updated to file or absolute path.
- forceDirectory : bool, optional
  If True forces the URI to end with a separator, otherwise given URI is interpreted as is.
- isTemporary : bool, optional
  If True indicates that this URI points to a temporary resource.
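The handling of scheme-less input described in the parameter list can be approximated with the standard library. This is a minimal sketch and not the ButlerURI implementation: `to_absolute_file_uri` and its `force_absolute` argument are hypothetical names, POSIX-style paths are assumed, and `root` is taken to be a plain absolute path rather than a file URI.

```python
import os
import posixpath
from urllib.parse import quote, urlparse


def to_absolute_file_uri(uri, root=None, force_absolute=True):
    """Hypothetical helper mimicking the documented scheme-less handling."""
    if urlparse(uri).scheme:
        # URIs that already carry a scheme are returned unchanged in this sketch.
        return uri
    if not force_absolute:
        # Mirrors forceAbsolute=False: the scheme-less URI stays scheme-less.
        return uri
    # Mirrors root=None: fall back to the current working directory.
    base = os.getcwd() if root is None else root
    abs_path = posixpath.normpath(posixpath.join(base, uri))
    # URIs with a scheme carry quoted paths, so quote the result.
    return "file://" + quote(abs_path)
```

An absolute scheme-less path ignores `root` in this sketch, because `posixpath.join` discards the base when the second argument is absolute.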
Attributes Summary
- fragment: Return the fragment component of the URI.
- isLocal: If True this URI refers to a local file.
- is_root: Return whether this URI points to the root of the network location.
- netloc: Return the URI network location.
- ospath: Return the path component of the URI localized to current OS.
- params: Return any parameters included in the URI.
- path: Return the path component of the URI.
- query: Return any query strings included in the URI.
- quotePaths: True if path-like elements modifying a URI should be quoted.
- relativeToPathRoot: Return path relative to network location.
- scheme: Return the URI scheme.
- transferDefault: Default mode to use for transferring if auto is specified.
- transferModes: Transfer modes supported by this implementation.
- unquoted_path: Return path component of the URI with any URI quoting reversed.
Methods Summary
- abspath(): Return URI using an absolute path.
- as_local(): Return the location of the (possibly remote) resource as local file.
- basename(): Return the base name, last element of path, of the URI.
- dirname(): Return the directory component of the path as a new ButlerURI.
- exists(): Indicate that the resource is available.
- findFileResources(candidates, …): Get all the files from a list of values.
- getExtension(): Return the file extension(s) associated with this URI path.
- geturl(): Return the URI in string form.
- isabs(): Indicate that the resource is fully specified.
- isdir(): Return True if this URI looks like a directory, else False.
- join(path, …): Return new ButlerURI with additional path components.
- mkdir(): For a dir-like URI, create the directory resource if needed.
- parent(): Return a ButlerURI of the parent directory.
- read(size): Open the resource and return the contents in bytes.
- relative_to(other): Return the relative path from this URI to the other URI.
- remove(): Remove the resource.
- replace(forceDirectory, **kwargs): Return new ButlerURI with specified components replaced.
- size(): For non-dir-like URI, return the size of the resource.
- split(): Split URI into head and tail.
- transfer_from(src, transfer, overwrite, …): Transfer the current resource to a new location.
- updatedExtension(ext): Return a new ButlerURI with updated file extension.
- updatedFile(newfile): Return new URI with an updated final component of the path.
- walk(file_filter): Walk the directory tree returning matching files and directories.
- write(data, overwrite): Write the supplied bytes to the new resource.
Attributes Documentation
- fragment
  Return the fragment component of the URI.
- is_root
  Return whether this URI points to the root of the network location.
  This means that the path component refers to the top level.
- netloc
  Return the URI network location.
- ospath
  Return the path component of the URI localized to current OS.
- params
  Return any parameters included in the URI.
- path
  Return the path component of the URI.
- query
  Return any query strings included in the URI.
- quotePaths = True
  True if path-like elements modifying a URI should be quoted.
  All non-schemeless URIs have to internally use quoted paths. Therefore, if a new file name is given (e.g. to updatedFile or join), a decision must be made whether to quote it to be consistent.
- relativeToPathRoot
  Return path relative to network location.
  Effectively, this is the path property with the POSIX separator stripped from the left-hand side of the path.
  Always unquotes.
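The relativeToPathRoot description (strip the leading separator, then unquote) can be reproduced with `urllib.parse`. The helper name `relative_to_path_root` is hypothetical and the snippet only illustrates the described transformation, not the class's own code.

```python
from urllib.parse import unquote, urlparse


def relative_to_path_root(uri):
    """Hypothetical stand-in for the relativeToPathRoot property."""
    path = urlparse(uri).path
    # Strip the POSIX separator from the left, then reverse any URI quoting.
    return unquote(path.lstrip("/"))
```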
- scheme
  Return the URI scheme.
  Notes
  (:// is not part of the scheme).
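Since the constructor accepts a `urllib.parse.ParseResult`, the component attributes documented here can be compared with the fields that `urlparse` produces. The URL below is made up for illustration, and the snippet shows standard-library behavior only; it makes no claim about how ButlerURI stores these values.

```python
from urllib.parse import urlparse

# A made-up URL that exercises every component.
parsed = urlparse("https://example.org/data/file.fits;type=a?rev=2#hdu1")

components = {
    "scheme": parsed.scheme,      # no trailing "://"
    "netloc": parsed.netloc,
    "path": parsed.path,
    "params": parsed.params,
    "query": parsed.query,
    "fragment": parsed.fragment,
}
```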
- transferDefault = 'copy'
  Default mode to use for transferring if auto is specified.
- transferModes = ('copy', 'auto', 'move')
  Transfer modes supported by this implementation.
  Move is special in that it is generally a copy followed by an unlink. Whether that unlink works depends critically on whether the source URI implements unlink. If it does not, the move will be reported as a failure.
- unquoted_path
  Return path component of the URI with any URI quoting reversed.
Methods Documentation
- abspath() → lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI
  Return URI using an absolute path.
  Returns:
  - abs : ButlerURI
    Absolute URI. For non-schemeless URIs this always returns itself. Schemeless URIs are upgraded to file URIs.
- as_local() → Iterator[lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI]
  Return the location of the (possibly remote) resource as local file.
  Yields:
  - local : ButlerURI
    If this is a remote resource, it will be a copy of the resource on the local file system, probably in a temporary directory. For a local resource this should be the actual path to the resource.
  Notes
  The context manager will automatically delete any local temporary file.
  Examples
  Should be used as a context manager:
    with uri.as_local() as local:
        ospath = local.ospath
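The cleanup behavior described for as_local can be illustrated with `contextlib`. This is a sketch under stated assumptions: the function below and its `is_remote` flag are hypothetical, and a file copy stands in for whatever download a remote scheme would perform.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def as_local(path, is_remote):
    """Hypothetical stand-in: yield a local path and clean up temporary copies."""
    if not is_remote:
        # A local resource is yielded as-is; nothing is copied or deleted.
        yield path
        return
    tmpdir = tempfile.mkdtemp()
    local = os.path.join(tmpdir, os.path.basename(path))
    try:
        # A real remote scheme would download here; this sketch copies a file.
        shutil.copyfile(path, local)
        yield local
    finally:
        # The temporary copy disappears when the with-block exits.
        shutil.rmtree(tmpdir, ignore_errors=True)
```

Any path obtained inside the `with` block should be treated as invalid once the block exits, since the temporary copy is gone by then.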
- basename() → str
  Return the base name, last element of path, of the URI.
  Returns:
  - tail : str
    Last part of the path attribute. Tail will be empty if path ends on a separator.
  Notes
  If the URI ends on a slash, returns an empty string. This is the second element returned by split().
  Equivalent of os.path.basename().
- dirname() → lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI
  Return the directory component of the path as a new ButlerURI.
  Returns:
  - head : ButlerURI
    Everything except the tail of path attribute, expanded and normalized as per ButlerURI rules.
  Notes
  Equivalent of os.path.dirname().
- exists() → bool
  Indicate that the resource is available.
  Returns:
- classmethod findFileResources(candidates: Iterable[Union[str, lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI]], file_filter: Optional[str] = None, grouped: bool = False) → Iterator[Union[lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI, Iterator[lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI]]]
  Get all the files from a list of values.
  Parameters:
  - candidates : iterable [str or ButlerURI]
    The files to return and directories in which to look for files to return.
  - file_filter : str, optional
    The regex to use when searching for files within directories. By default returns all the found files.
  - grouped : bool, optional
    If True the results will be grouped by directory and each yielded value will be an iterator over URIs. If False each URI will be returned separately.
  Yields:
  - found_file : ButlerURI
    The passed-in URIs and URIs found in passed-in directories. If grouping is enabled, each of the yielded values will be an iterator yielding members of the group. Files given explicitly will be returned as a single group at the end.
  Notes
  If a value is a file it is yielded immediately. If a value is a directory, all the files in the directory (recursively) that match the regex will be yielded in turn.
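The yielding rules for findFileResources can be sketched for local paths with `os.walk`. This is an illustration only: `find_file_resources` is a hypothetical function working on plain path strings, and the choice of `re.search` against the bare file name is an assumption, since the page does not say how the regex is applied.

```python
import os
import re


def find_file_resources(candidates, file_filter=None, grouped=False):
    """Hypothetical local-path analogue of the documented behavior."""
    pattern = re.compile(file_filter) if file_filter else None
    explicit = []
    for candidate in candidates:
        if os.path.isdir(candidate):
            # Directories are searched recursively; the regex applies here only.
            for dirpath, _dirnames, filenames in os.walk(candidate):
                matches = [
                    os.path.join(dirpath, name)
                    for name in sorted(filenames)
                    if pattern is None or pattern.search(name)
                ]
                if not matches:
                    continue
                if grouped:
                    yield iter(matches)  # one group per directory
                else:
                    yield from matches
        elif grouped:
            # Explicit files are collected and returned as one final group.
            explicit.append(candidate)
        else:
            # Without grouping, explicit files are yielded immediately.
            yield candidate
    if grouped and explicit:
        yield iter(explicit)
```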
- getExtension() → str
  Return the file extension(s) associated with this URI path.
  Returns:
  - ext : str
    The file extension (including the .). Can be empty string if there is no file extension. Usually returns only the last file extension unless there is a special extension modifier indicating file compression, in which case the combined extension (e.g. .fits.gz) will be returned.
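The combined-extension rule can be sketched as follows. The set of compression suffixes is an assumption made for this example (the page names only `.fits.gz`), and `get_extension` is a hypothetical function operating on a plain path string.

```python
import posixpath

# Assumed compression suffixes; the actual list is not given on this page.
COMPRESSION_SUFFIXES = {".gz", ".bz2", ".xz"}


def get_extension(path):
    """Hypothetical analogue of getExtension for a POSIX-style path."""
    name = posixpath.basename(path)
    stem, ext = posixpath.splitext(name)
    if ext in COMPRESSION_SUFFIXES:
        # Combine the compression suffix with the extension before it.
        inner = posixpath.splitext(stem)[1]
        return inner + ext
    # Otherwise only the last extension (possibly empty) is returned.
    return ext
```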
- isabs() → bool
  Indicate that the resource is fully specified.
  For non-schemeless URIs this is always true.
  Returns:
- isdir() → bool
  Return True if this URI looks like a directory, else False.
- join(path: Union[str, lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI]) → lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI
  Return new ButlerURI with additional path components.
  Parameters:
  - path : str, ButlerURI
    Additional file components to append to the current URI. Assumed to include a file at the end. Will be quoted depending on the associated URI scheme. If the path looks like a URI with a scheme referring to an absolute location, it will be returned directly (matching the behavior of os.path.join()). It can also be a ButlerURI.
  Returns:
  - new : ButlerURI
    New URI with any file at the end replaced with the new path components.
  Notes
  Schemeless URIs assume local path separator but all other URIs assume POSIX separator if the supplied path has directory structure. It may be this never becomes a problem but datastore templates assume POSIX separator is being used.
  Currently, if the join path is given as an absolute scheme-less URI it will be returned as an absolute file: URI even if the URI it is being joined to is non-file.
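The join rules above can be approximated for URIs that carry a scheme. This is a rough sketch built on `urllib.parse` and `posixpath`: `join_uri` is a hypothetical function, it treats a base path ending in `/` as a directory, and it does not reproduce any normalization the class may apply.

```python
import posixpath
from urllib.parse import quote, urlsplit, urlunsplit


def join_uri(base, path):
    """Hypothetical analogue of join() for string URIs with a scheme."""
    if urlsplit(path).scheme:
        # A path that is itself a URI with a scheme is returned directly.
        return path
    if path.startswith("/"):
        # An absolute scheme-less path becomes an absolute file: URI.
        return "file://" + quote(path)
    parts = urlsplit(base)
    # Any file at the end of the base is replaced by the new components.
    head = parts.path if parts.path.endswith("/") else posixpath.dirname(parts.path)
    new_path = posixpath.join(head, quote(path))
    return urlunsplit(parts._replace(path=new_path))
```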
- mkdir() → None
  For a dir-like URI, create the directory resource if needed.
- parent() → lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI
  Return a ButlerURI of the parent directory.
  Returns:
  Notes
  For a file-like URI this will be the same as calling dirname().
- read(size: int = -1) → bytes
  Open the resource and return the contents in bytes.
  Parameters:
  - size : int, optional
    The number of bytes to read. Negative or omitted indicates that all data should be read.
- relative_to(other: lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI) → Optional[str]
  Return the relative path from this URI to the other URI.
  Parameters:
  - other : ButlerURI
    URI to use to calculate the relative path. Must be a parent of this URI.
  Returns:
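A parent-relative path calculation can be sketched with string URIs. The return description is missing from this page, so the sketch assumes that `None` signals "other is not a parent", which the `Optional[str]` return type suggests but does not confirm. The function name and its string arguments are hypothetical.

```python
from urllib.parse import urlsplit


def relative_to(child, parent):
    """Hypothetical analogue: path of child relative to parent, or None."""
    c, p = urlsplit(child), urlsplit(parent)
    if (c.scheme, c.netloc) != (p.scheme, p.netloc):
        # Different scheme or network location: parent cannot contain child.
        return None
    # Compare against a directory-style prefix to avoid partial-name matches.
    parent_path = p.path if p.path.endswith("/") else p.path + "/"
    if not c.path.startswith(parent_path):
        return None
    return c.path[len(parent_path):]
```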
- remove() → None
  Remove the resource.
- replace(forceDirectory: bool = False, **kwargs) → lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI
  Return new ButlerURI with specified components replaced.
  Parameters:
  - forceDirectory : bool
    Parameter passed to ButlerURI constructor to force this new URI to be dir-like.
  - kwargs : dict
    Components of a urllib.parse.ParseResult that should be modified for the newly-created ButlerURI.
  Returns:
  Notes
  Does not, for now, allow a change in URI scheme.
- size() → int
  For non-dir-like URI, return the size of the resource.
  Returns:
  - sz : int
    The size in bytes of the resource associated with this URI. Returns 0 if dir-like.
- split() → Tuple[lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI, str]
  Split URI into head and tail.
  Returns:
  - head : ButlerURI
    Everything leading up to tail, expanded and normalized as per ButlerURI rules.
  - tail : str
    Last self.path component. Tail will be empty if path ends on a separator. Tail will never contain separators. It will be unquoted.
  Notes
  Equivalent to os.path.split() where head preserves the URI components.
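The relationship to `os.path.split()` can be shown on string URIs. In this sketch `split_uri` is a hypothetical function; the head keeps the scheme and network location, but the normalization that ButlerURI applies to the head is not reproduced.

```python
import posixpath
from urllib.parse import unquote, urlsplit, urlunsplit


def split_uri(uri):
    """Hypothetical analogue of split() returning (head, tail) strings."""
    parts = urlsplit(uri)
    head_path, tail = posixpath.split(parts.path)
    # The head preserves the other URI components; only the path changes.
    head = urlunsplit(parts._replace(path=head_path))
    # The tail never contains separators and is returned unquoted.
    return head, unquote(tail)
```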
- transfer_from(src: ButlerURI, transfer: str, overwrite: bool = False, transaction: Optional[Union[DatastoreTransaction, NoTransaction]] = None) → None
  Transfer the current resource to a new location.
  Parameters:
  - src : ButlerURI
    Source URI.
  - transfer : str
    Mode to use for transferring the resource. Generically there are many standard options: copy, link, symlink, hardlink, relsymlink. Not all URIs support all modes.
  - overwrite : bool, optional
    Allow an existing file to be overwritten. Defaults to False.
  - transaction : DatastoreTransaction, optional
    A transaction object that can (depending on implementation) rollback transfers on error. Not guaranteed to be implemented.
  Notes
  Conceptually this is hard to scale as the number of URI schemes grows. The destination URI is more important than the source URI since that is where all the transfer modes are relevant (with the complication that “move” deletes the source).
  Local file to local file is the fundamental use case but every other scheme has to support “copy” to local file (with implicit support for “move”) and copy from local file. All the “link” options tend to be specific to local file systems.
  “move” is a “copy” where the remote resource is deleted at the end. Whether this works depends on the source URI rather than the destination URI. Reverting a move on transaction rollback is expected to be problematic if a remote resource was involved.
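For the local-file-to-local-file case that the notes call fundamental, the mode dispatch and the overwrite guard might look like the sketch below. `transfer_local` is a hypothetical function on plain paths; it covers only a subset of the listed modes and ignores transactions entirely.

```python
import os
import shutil

# Subset of the listed modes handled by this sketch.
SUPPORTED_MODES = ("copy", "move", "symlink", "hardlink")


def transfer_local(src, dst, transfer, overwrite=False):
    """Hypothetical local-only illustration of transfer modes."""
    if transfer not in SUPPORTED_MODES:
        raise ValueError(f"unsupported transfer mode: {transfer}")
    if os.path.lexists(dst):
        if not overwrite:
            # Mirrors overwrite=False: an existing destination is an error.
            raise FileExistsError(dst)
        os.remove(dst)
    if transfer == "copy":
        shutil.copyfile(src, dst)
    elif transfer == "move":
        # On a local file system the source is gone afterwards.
        shutil.move(src, dst)
    elif transfer == "symlink":
        os.symlink(os.path.abspath(src), dst)
    else:  # "hardlink"
        os.link(src, dst)
```

The mode is validated before the destination is touched, so an unsupported mode never removes an existing file.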
- updatedExtension(ext: Optional[str]) → lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI
  Return a new ButlerURI with updated file extension.
  All file extensions are replaced.
  Parameters:
  Returns:
  - updated : ButlerURI
    URI with the specified extension. Can return itself if no extension was specified.
- updatedFile(newfile: str) → lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI
  Return new URI with an updated final component of the path.
  Parameters:
  - newfile : str
    File name with no path component.
  Returns:
  - updated : ButlerURI
  Notes
  Forces the ButlerURI.dirLike attribute to be false. The new file path will be quoted if necessary.
- walk(file_filter: Union[str, re.Pattern, None] = None) → Iterator[Union[List[T], Tuple[lsst.daf.butler.core._butlerUri._butlerUri.ButlerURI, List[str], List[str]]]]
  Walk the directory tree returning matching files and directories.
  Parameters:
  - file_filter : str or re.Pattern, optional
    Regex to filter out files from the list before it is returned.
  Yields:
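A local analogue of a filtered walk can be written with `os.walk`. The yielded items are not described on this page, so the `(dirpath, dirnames, filenames)` shape below is borrowed from `os.walk` and the signature's tuple type, and keeping only the file names that match the regex is an assumption based on the summary line ("returning matching files"). The function name is hypothetical.

```python
import os
import re


def walk_filtered(top, file_filter=None):
    """Hypothetical local-path analogue of walk()."""
    # Accept either a regex string or an already compiled pattern.
    pattern = re.compile(file_filter) if isinstance(file_filter, str) else file_filter
    for dirpath, dirnames, filenames in os.walk(top):
        if pattern is not None:
            # Assumed semantics: keep only file names matching the regex.
            filenames = [name for name in filenames if pattern.search(name)]
        yield dirpath, sorted(dirnames), sorted(filenames)
```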
- write(data: bytes, overwrite: bool = True) → None
  Write the supplied bytes to the new resource.
  Parameters: