Welcome to ‘namefiles’ documentation!

Name-files is an approach to standardized file naming for multiple files from different sources with equal formatting, all of which relate to the same entity.


Installation

Install the latest release with pip.

$ pip install namefiles

Basic Usage

In its current implementation, namefiles revolves around obtaining filenames within Python scripts that comply with a file naming convention.

A first use case is deriving a new filename from a source filename. You might have a source file that is used in a process resulting in a new file, for which a related name is required.

By setting the new filename parts, a new path can be obtained quickly.

>>> from namefiles import FilenameParts
>>> source_filename = FilenameParts.disassemble("/root/path/A#file.txt")
>>> target_filename = source_filename.with_parts(
...     sub_id="NEW", source_id="filename"
... )
>>> target_filename.to_path()
PosixPath('/root/path/A.txt')


Another use case is using metadata already carrying the filename parts.

>>> sample_metadata = {
...     "identifier": "A",
...     "sub_id": "FILE",
...     "context": "name",
...     "non-filename-field": "Is not for the filename."
... }
>>> from namefiles import FilenameParts
>>> new_filepath_giver = FilenameParts(
...     root_path="/root/path", extension=".txt", **sample_metadata
... )
>>> str(new_filepath_giver)
'/root/path/A#FILE.name.txt'

API reference

namefiles

namefiles.disassemble_filename(target_path)

Disassembles a file’s name into the parts defined by a file naming convention.

namefiles.construct_filename([…])

Constructs a filename using a filename convention.

namefiles.construct_filepath([…])

Constructs a filepath using a file naming convention.

namefiles.extract_filename_parts(…[, …])

Extracts filename parts from a dictionary based on a file naming convention.

namefiles.get_filename_convention([…])

Gets the currently defined file naming convention.

namefiles.get_filename_validator([…])

Returns a filename validator for applying the file naming convention.

namefiles.is_a_filename_part(part_name[, …])

Returns whether the part name is within the file naming convention.

namefiles.register_filename_validator(…)

Registers a file naming convention.

ANameGiver

namefiles.ANameGiver.set_parts(**filename_parts)

Sets filename parts with new values.

namefiles.ANameGiver.to_path([root_path])

Returns a pathlib.Path of the declared filename parts.

namefiles.ANameGiver.get_filename_validator()

Returns this name giver’s validator providing the file naming convention.

namefiles.ANameGiver.set_name_part(…)

Sets the value of a convention’s filename part.

namefiles.ANameGiver.disassemble(…)

Disassembles the filename returning ANameGiver.

class namefiles.ANameGiver(**filename_parts)

A Name Giver is the abstract base class that can be used to define a custom file naming convention. This can be achieved by subclassing ANameGiver and overriding its classmethod get_filename_validator, which needs to return a jsonschema.IValidator.

Notes

jsonschema has no declaration of IValidator. The called methods within namefiles are declared within JsonschemaValidator as a substitution.

Parameters

**filename_parts – Filename parts for the implemented file name convention.

Examples

To enable a custom filename convention, subclass namefiles.ANameGiver and override namefiles.ANameGiver.get_filename_validator() to provide your file naming convention. This example uses the naming convention of namefiles, which follows the jsonschema draft 7 specification.

>>> from doctestprinter import doctest_print
>>> from jsonschema import Draft7Validator
>>> from namefiles import ANameGiver, get_filename_convention
>>> class MyFilenameParts(ANameGiver):
...     CUSTOM_VALIDATOR = Draft7Validator(get_filename_convention())
...     @classmethod
...     def get_filename_validator(cls) -> Draft7Validator:
...         # Put your custom file naming convention (jsonschema) here
...         return cls.CUSTOM_VALIDATOR
>>> sample_parts = MyFilenameParts.disassemble("A#NAME.txt")
>>> sample_parts
MyFilenameParts(root_path: ., identifier: A, extension: .txt, sub_id: NAME)
>>> str(sample_parts)
'A#NAME.txt'
>>> sample_parts.set_parts(
...     identifier="Zebra", vargroup=["in", "the"], extension=".zoo"
... )
>>> str(sample_parts)
'Zebra#NAME#_in_the.zoo'
>>> sample_parts.set_parts(
...     identifier="Z", sub_id="BRA", vargroup="", extension=""
... )
>>> str(sample_parts)
'Z#BRA'

Implements collections.abc.Mapping

>>> converted_into_dict = dict(sample_parts)
>>> doctest_print(converted_into_dict, max_line_width=70)
{'root_path': '.', 'identifier': 'Z', 'extension': '', 'source_id': '',
'sub_id': 'BRA', 'context': '', 'vargroup': ''}
>>> len(sample_parts)
7
>>> sample_parts["sub_id"]
'BRA'

Disassembling of path and filename

>>> sample_parts = MyFilenameParts.disassemble("/a/path/Z#BRA.txt")
>>> sample_parts
MyFilenameParts(root_path: /a/path, identifier: Z, extension: .txt, sub_id: BRA)
>>> str(sample_parts.to_path())
'/a/path/Z#BRA.txt'
>>> str(sample_parts.to_path(root_path="/another/path"))
'/another/path/Z#BRA.txt'

FilenameParts

namefiles.FilenameParts.set_parts(…)

Sets filename parts with new values.

namefiles.FilenameParts.to_path([root_path])

Returns a pathlib.Path of the declared filename parts.

namefiles.FilenameParts.get_filename_validator()

Returns this name giver’s validator providing the file naming convention.

namefiles.FilenameParts.set_name_part(…)

Sets the value of a convention’s filename part.

namefiles.FilenameParts.disassemble(…)

Disassembles the filename returning ANameGiver.

namefiles.FilenameParts.identifier

The mandatory entity’s name which relates to multiple files.

namefiles.FilenameParts.sub_id

The sub id is the first branch of the identifier.

namefiles.FilenameParts.source_id

The source id states, where this file came from.

namefiles.FilenameParts.vargroup

The group of variables (vargroup) contains meta attributes.

namefiles.FilenameParts.context

Context of the file’s content.

namefiles.FilenameParts.extension

The common file extension with a leading dot.

class namefiles.FilenameParts(identifier: Optional[str] = None, sub_id: Optional[str] = None, source_id: Optional[str] = None, vargroup: Optional[List[str]] = None, context: Optional[str] = None, extension: Optional[str] = None, root_path: Optional[str] = None, **kwargs)

FilenameParts implements the current standard file naming convention. It is the convenient tool for making a new filename based on the latest standard file naming convention.

Parameters
  • identifier – The mandatory entity’s name which relates to multiple files. The identifier is the leading filename part.

  • sub_id – The sub id is the first branch of the identifier.

  • source_id – The source id states, where this file came from.

  • vargroup – The group of variables (vargroup) contains meta attributes.

  • context – Context of the file’s content. What is this about?

  • extension – The extension of this file. The extension states the file’s format or structure.

  • root_path – The file’s location.

Examples

The major entry point is the disassemble method, which returns the FilenameParts instance containing all filename parts based on the latest standard file naming convention.

>>> from namefiles import FilenameParts
>>> sample_giver = FilenameParts.disassemble("A#NAME.txt")
>>> sample_giver
FilenameParts(root_path: ., identifier: A, extension: .txt, sub_id: NAME)

The FilenameParts mimics a Mapping and additionally provides the major filename parts as properties.

>>> sample_giver["identifier"]
'A'
>>> sample_giver["sub_id"]
'NAME'
>>> sample_giver.identifier
'A'
>>> sample_giver.identifier = "Zebra"
>>> sample_giver.identifier
'Zebra'

Either convert the instance to a string to get a filename (filepath)

>>> str(sample_giver)
'Zebra#NAME.txt'

or use the FilenameParts.to_path() method to receive a pathlib.PurePath.

Concept

The filename is defined by six parts, each serving a different purpose and all relating to one entity.

  • identifier: The mandatory name (identification) of an entity.

  • sub_id: A branch of this entity.

  • source_id: The source from which the file (data) originates.

  • vargroup: The possibility to state variables.

  • context: Context of the file’s content: what is in there, not how it is stored. The context must always be accompanied by an extension.

  • extension: The file extension, which should state the format of the file, i.e. how the content is stored.

All filename parts except the identifier are optional.
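The order of the parts can be illustrated with a small standard-library sketch. The helper build_filename is hypothetical and not part of namefiles; it only mirrors the separators visible in this document's examples ('#' before sub_id, source_id and vargroup, '_' before each variable, '.' before the context) and performs no pattern validation.

```python
from typing import Optional, Sequence


def build_filename(
    identifier: str,
    sub_id: Optional[str] = None,
    source_id: Optional[str] = None,
    vargroup: Optional[Sequence[str]] = None,
    context: Optional[str] = None,
    extension: Optional[str] = None,
) -> str:
    """Join filename parts in the order used by the examples in this document.

    Hypothetical helper for illustration only; not the namefiles API.
    """
    if not identifier:
        raise ValueError("The identifier is mandatory.")
    if context and not extension:
        raise ValueError("A context must be accompanied by an extension.")

    filename = identifier
    if sub_id:
        filename += "#" + sub_id
    if source_id:
        filename += "#" + source_id
    if vargroup:
        # The group is introduced by '#', each variable by '_'.
        filename += "#" + "".join("_" + value for value in vargroup)
    if context:
        filename += "." + context
    if extension:
        # The extension already carries its leading dot.
        filename += extension
    return filename


print(build_filename("A", sub_id="FILE", context="name", extension=".txt"))
print(build_filename("Zebra", sub_id="NAME", vargroup=["in", "the"], extension=".zoo"))
print(build_filename("Z", sub_id="BRA"))
```

The three calls reproduce the filenames 'A#FILE.name.txt', 'Zebra#NAME#_in_the.zoo' and 'Z#BRA' that appear in the doctest examples of this document.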

Within namefiles, file naming conventions are defined by a JSON Schema using the Python jsonschema module. namefiles proposes a standard naming convention, which is used if no custom naming convention is defined.

The EBNF of the namefiles naming convention is

filename     ::= identifier ["#" sub_id] ["#" source_id] ["#" vargroup] ["." context] ["." extension]
identifier   ::= [0-9a-zA-Z-_]{1,36}
sub_id       ::= [0-9A-Z]{1,4}
source_id    ::= [0-9A-Z]{5,12}
vargroup     ::= ("_" var_value)+
var_value    ::= [a-zA-Z0-9,.+-\ ]+
context      ::= [a-zA-Z]+[0-9a-zA-Z-]+
extension    ::= common file extension (.csv, .txt, ...)
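As a rough illustration of how this grammar separates the parts, the following sketch translates a subset of it into one regular expression using only the standard library. It is not the namefiles implementation: the names FILENAME_PATTERN and split_filename are hypothetical, the vargroup is left out because its values may contain dots, and the extension pattern (a dot followed by letters or digits) is an assumption.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern derived from the grammar above (vargroup omitted).
FILENAME_PATTERN = re.compile(
    r"(?P<identifier>[0-9a-zA-Z_-]{1,36})"
    # sub_id and source_id share the '#' prefix and differ in length.
    r"(?:#(?P<sub_id>[0-9A-Z]{1,4}))?"
    r"(?:#(?P<source_id>[0-9A-Z]{5,12}))?"
    # A context is only recognized when another dot (the extension) follows.
    r"(?:\.(?P<context>[a-zA-Z]+[0-9a-zA-Z-]+)(?=\.))?"
    # Assumption: an extension is a dot followed by letters or digits.
    r"(?P<extension>\.[0-9a-zA-Z]+)?"
)


def split_filename(filename: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the matched parts, or None if the name does not fit the pattern."""
    match = FILENAME_PATTERN.fullmatch(filename)
    return match.groupdict() if match else None


print(split_filename("A#FILE.name.txt"))
print(split_filename("ant#CAM0.avi"))
print(split_filename("Zeb-a#1#CANON.jpg"))
print(split_filename("a#b"))
```

Parts that are absent come back as None. The last call returns None as a whole, because a lowercase 'b' matches neither the sub_id nor the source_id pattern.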

Implementation

The recommended namefiles.FilenameParts implements the default namefiles file naming convention, providing access to each part via properties.

FilenameParts.identifier

The mandatory entity’s name which relates to multiple files. The identifier is the leading filename part.

Notes

The identifier has a maximum length of 36 characters and can consist of words [a-zA-Z0-9_] with the addition of the hyphen-minus ‘-‘ (U+002D), which should be the default on keyboards.

Its regular expression is ^[0-9a-zA-Z-_]+$

Examples

Minimal to maximal identifier examples.

a                                       # At least 1 character is needed.
1044e098-7bfb-11eb-9439-0242ac130002    # 36 chars allows a UUID

Returns

str
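The character set and the 36-character limit of the identifier can be checked together with the standard library. The sketch below is an assumption-labelled illustration, not the validator used by namefiles: IDENTIFIER_PATTERN and is_valid_identifier are hypothetical names, and the length bound is taken from the note above.

```python
import re
import uuid

# Hypothetical pattern: the documented character set plus the 36-character limit.
IDENTIFIER_PATTERN = re.compile(r"[0-9a-zA-Z_-]{1,36}")


def is_valid_identifier(candidate: str) -> bool:
    """Check a candidate against the character set and length limit."""
    return IDENTIFIER_PATTERN.fullmatch(candidate) is not None


print(is_valid_identifier("a"))                # shortest possible identifier
print(is_valid_identifier(str(uuid.uuid4())))  # a UUID string has 36 characters
print(is_valid_identifier("a" * 37))           # one character too long
print(is_valid_identifier("Zeb a"))            # a space is not allowed
```

The first two calls print True and the last two print False.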

FilenameParts.sub_id

The sub id is the first branch of the identifier.

Notes

The sub identifier allows uppercase words without the underscore [A-Z0-9] with a maximum length of 4.

Its regular expression is ^[0-9A-Z-]{1,4}+$

The sub identifier’s task is to distinguish different states of the same context. A context in this sense could be different video captures of the same object with multiple cameras or just different file versions.

The sub identifier should be seen as a branch of the identifier, not as a version within a sequence.

Examples

Multiple different video captures of the same object.

ant#CAM0.avi
ant#CAM1.avi
ant#CAM2.avi

Different children (versions).

a#1
a#1ST
a#2ND
a#RAW

Returns

str

FilenameParts.source_id

The source id states, where this file came from.

Notes

The source identifier allows words without underscores [a-zA-Z0-9] with the addition of the hyphen-minus ‘-‘ (U+002D), which should be the default on keyboards.

Its regular expression is ^[0-9A-Z-]{5,12}+$

The source identifier distinguishes different sources whenever the context would otherwise lead to equal filenames. It might be the name of the program or device that made this file.

Examples

A comparison of sources for two different sub versions of Zeb-a.

Zeb-a#1#canon.jpg
Zeb-a#2#canon.jpg
Zeb-a#1#nikon.jpg
Zeb-a#2#nikon.jpg

Returns

str

FilenameParts.vargroup

The group of variables (vargroup) contains meta attributes.

Notes

Each variable of the group is a string. It allows words [a-zA-Z0-9_] with the addition of:

  • ‘-‘ hyphen-minus (U+002D)

  • ‘+’ plus

  • ‘,’ comma

  • ‘.’ dot

Its regular expression is ^#(_[a-zA-Z0-9+-,. ]+)+$

Examples in which meta attributes are stored in the filename:

  • number of a subsequent sequence e.g. image sequences

  • a date neither being the creation nor the change date

Examples

>>> from namefiles import FilenameParts
>>> FilenameParts.disassemble("Zeb-a#_000000_ffffff_1.9m_no color").vargroup
['000000', 'ffffff', '1.9m', 'no color']

Returns

List[str]
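How a list of variables maps to the vargroup text and back can be sketched with the standard library. This is an illustration under assumptions, not the namefiles implementation: the names VARGROUP_PATTERN, join_vargroup and split_vargroup are hypothetical, and the character class is written so that the hyphen is matched literally.

```python
import re
from typing import List

# Hypothetical pattern: '#' followed by one or more '_'-prefixed values.
VARGROUP_PATTERN = re.compile(r"#(?:_[a-zA-Z0-9+,. -]+)+")


def join_vargroup(values: List[str]) -> str:
    """Encode the variables as they appear within a filename."""
    return "#" + "".join("_" + value for value in values)


def split_vargroup(text: str) -> List[str]:
    """Decode the vargroup text of a filename into its variables."""
    if VARGROUP_PATTERN.fullmatch(text) is None:
        raise ValueError(f"Not a valid vargroup: {text!r}")
    # Drop the leading '#_' and split at the remaining underscores.
    return text[2:].split("_")


encoded = join_vargroup(["000000", "ffffff", "1.9m", "no color"])
print(encoded)
print(split_vargroup(encoded))
```

Because the underscore separates the variables, a single variable cannot contain an underscore in this sketch.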

FilenameParts.context

Context of the file’s content. What is this file about?

Notes

The context allows words without underscores [a-zA-Z0-9] starting with an alphabetic character.

Its regular expression is ^[a-zA-Z]+[0-9a-zA-Z-]+$

While the file extension states the formatting of the file, like ‘.txt’ being a text file or ‘.csv’ being a specifically formatted text file, it does not state any information about the file’s context.

Returns

str

FilenameParts.extension

The common file extension with a leading dot.

Notes

The extension states how the content is encoded and which structure it has.

Examples

A file ending with ‘.txt’ is a plain text file, which is encoded with ‘utf-8’ in the best case.

A file ending with ‘.csv’ is a plain text file, which contains a table having ‘comma separated values’. Other examples are common formats like .json or .yml.

Instead of creating uncommon file endings for custom text-based file formats, the text files should end with ‘.txt’. To state the custom content, the context filename part can be used.

Returns

str
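One consequence of placing the context in front of the extension is that generic tooling still sees the common extension. The sketch below shows this with pathlib from the standard library only; the example path is taken from the basic usage section, and reading the context from the second-to-last suffix is a simplifying assumption that does not hold when a vargroup value contains a dot.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/root/path/A#FILE.name.txt")

# pathlib reports the last suffix as the file extension.
print(path.suffix)    # .txt
print(path.suffixes)  # ['.name', '.txt']

# Assumption: with two suffixes, the first one carries the context.
context = path.suffixes[-2][1:] if len(path.suffixes) > 1 else ""
print(context)        # name
```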
