File I/O subpackage

All datesy functions that handle file I/O are listed here, grouped by file type. The only exception is the first module, which provides functions for file selection.

file_selection module

The file_selection module provides supporting functions for interacting with files

datesy.file_IO.file_selection.get_latest_file_from_directory(directory, file_ending=None, pattern=None, regex=None)

Return the latest file_name (optionally filtered) from a directory

Parameters:
  • directory (str) – the directory where to get the latest file_name from
  • file_ending (str, set, optional) – the file_name ending(s) specifying the file type
  • pattern (str, optional) – pattern for the file_name to match, e.g. DataFile_*.json, where * could be a date or any other string
  • regex (str, optional) – a regular_expression (regex) for pattern matching
Returns:

the file_name with the latest change date

Return type:

str
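
The selection described above can be approximated with the standard library. The following is a minimal sketch, not datesy's implementation: the helper name latest_file is hypothetical, pattern is assumed to use shell-style wildcards (fnmatch), and the "latest change date" is assumed to be the modification time.

```python
import fnmatch
import os
import re


def latest_file(directory, file_ending=None, pattern=None, regex=None):
    """Return the most recently modified file in `directory`, or None."""
    candidates = []
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        if not os.path.isfile(full_path):
            continue
        if file_ending is not None:
            # str.endswith accepts a tuple, so a set of endings is converted first.
            endings = (file_ending,) if isinstance(file_ending, str) else tuple(file_ending)
            if not name.endswith(endings):
                continue
        if pattern is not None and not fnmatch.fnmatch(name, pattern):
            continue
        if regex is not None and not re.search(regex, name):
            continue
        candidates.append(full_path)
    # os.path.getmtime returns the last modification time as a timestamp.
    return max(candidates, key=os.path.getmtime) if candidates else None
```

All three filters are combined with a logical AND in this sketch; whether datesy combines them the same way is not stated in this reference.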

datesy.file_IO.file_selection.get_file_list_from_directory(directory, file_ending=None, pattern=None, regex=None)

Return all files (optionally filtered) from directory in a list

Parameters:
  • directory (str) – the directory containing the desired files
  • file_ending (str, set, optional) – the file_name’s ending specifying the file type
  • pattern (str, optional) – pattern for the file_names to match, e.g. DataFile_*.json, where * could be a date or any other string
  • regex (str, optional) – a regular expression (regex) for pattern matching
Returns:

a list of the relative paths of all matching files

Return type:

list

datesy.file_IO.file_selection.return_file_list_if_path(path, file_ending=None, pattern=None, regex=None, return_always_list=False)

Return all files in a directory (optionally filtered) if path is a directory

Parameters:
  • path (str) – the path to test for being a directory
  • file_ending (str, set, optional) – the file_name ending(s) specifying the file type of the files in the directory
  • pattern (str, optional) – pattern for the file_names in the directory to match, e.g. DataFile_*.json, where * could be a date or any other string
  • regex (str, optional) – a regular expression (regex) for pattern matching of the file_names
  • return_always_list (bool, optional) – whether a single path shall be returned wrapped in a list
Returns:

the list of files if path is a directory, else the path itself (wrapped in a list if return_always_list is set)

Return type:

list, str
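
The two possible return types can be illustrated with a short sketch. It assumes only what is described above; the helper name files_if_directory is hypothetical, and the filtering options are left out for brevity.

```python
import os


def files_if_directory(path, return_always_list=False):
    """Return the files in `path` if it is a directory, else `path` itself."""
    if os.path.isdir(path):
        # Sub-directories are skipped; only regular files are listed.
        return sorted(
            os.path.join(path, name)
            for name in os.listdir(path)
            if os.path.isfile(os.path.join(path, name))
        )
    # A single path is wrapped in a list only on request.
    return [path] if return_always_list else path
```

Setting return_always_list lets calling code iterate over the result without checking its type first.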

datesy.file_IO.file_selection.check_file_name_ending(file_name, ending)

Check if the file_name has the expected file_ending

If one of the provided endings is the file_name’s ending return True, else False

Parameters:
  • file_name (str) – The file_name to check the ending for. The file_name may contain a path, so file_name.ending as well as path/to/file_name.ending will work
  • ending (str, set, list) – The desired ending or multiple desired endings. For a single entry e.g. .json or csv, for multiple endings e.g. ['.json', 'csv']
Returns:

True if the file_name’s ending is among the given endings, else False

Return type:

bool
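
The examples above give endings both with and without a leading dot (.json, csv). A minimal sketch of such a check, assuming both spellings are meant to be equivalent; the helper name has_ending is hypothetical and this is not datesy's code.

```python
def has_ending(file_name, ending):
    """Return True if `file_name` ends with one of the given endings."""
    endings = [ending] if isinstance(ending, str) else list(ending)
    # Normalize so that "json" and ".json" are treated alike.
    normalized = tuple("." + e.lstrip(".") for e in endings)
    return file_name.endswith(normalized)
```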

json_file module

The json_file module takes care of all I/O interactions concerning json files

datesy.file_IO.json_file.load(path)

Load json file(s) and return the dictionary or dictionaries. Specifying a file_name: one file will be loaded. Specifying a directory: all *.json files will be loaded.

Parameters: path (str) – path to a file_name or directory
Returns: dictionary representing the json {file_name: {data}}
Return type: dict

datesy.file_IO.json_file.load_single(file_name)

Load a single json file

Parameters: file_name (str) – file_name to load from
Returns: the loaded json as a dict {data}
Return type: dict

datesy.file_IO.json_file.load_these(file_name_list)

Load specified json files and return the data in a dictionary with file_name as key

Parameters: file_name_list (list) – list of file_names to load from
Returns: the dictionaries from the files as values of file_name as key {file_name: {data}}
Return type: dict(dict)

datesy.file_IO.json_file.load_all(directory)

Load all json files in the directory and return the data in a dictionary with file_name as key

Parameters: directory (str) – the directory containing the json files
Returns: the dictionaries from the files as values of file_name as key {file_name: {data}}
Return type: dict(dict)
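
The {file_name: {data}} shape returned by the directory loaders can be reproduced with the standard json module. This is a sketch under assumptions: the helper name load_all_json is hypothetical, and using the bare file name (rather than the full path) as key is a choice made for the example.

```python
import json
import os


def load_all_json(directory):
    """Load every *.json file in `directory` into {file_name: data}."""
    data = {}
    for name in sorted(os.listdir(directory)):
        if name.endswith(".json"):
            with open(os.path.join(directory, name), encoding="utf-8") as file:
                data[name] = json.load(file)
    return data
```
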

datesy.file_IO.json_file.write(file_name, data, beautify=True, sort=False)

Save json from dict to file

Parameters:
  • file_name (str) – the file_name to save under. If no ending is provided, saved as .json
  • data (dict) – the dictionary to be saved as json
  • beautify (bool, optional) – whether the data is written in a human-readable layout or in a single line (default: human-readable)
  • sort (bool, optional) – whether the keys shall be sorted (default: False)
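
How beautify and sort could map onto the standard json module is sketched below. The mapping to indent and sort_keys is an assumption, the helper name write_json is hypothetical, and returning the final file name is a convenience of this sketch only.

```python
import json
import os


def write_json(file_name, data, beautify=True, sort=False):
    """Write `data` as json and return the file name actually used."""
    # Append ".json" when the name has no ending at all.
    if not os.path.splitext(file_name)[1]:
        file_name += ".json"
    with open(file_name, "w", encoding="utf-8") as file:
        # indent=None writes everything on a single line.
        json.dump(data, file, indent=4 if beautify else None, sort_keys=sort)
    return file_name
```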

csv_file module

The csv_file module takes care of all I/O interactions concerning csv files

datesy.file_IO.csv_file.load(path, **kwargs)

Load csv file(s) and return the rows. Specifying a file_name: one file will be loaded. Specifying a directory: all *.csv files will be loaded.

Parameters:
  • path (str) – path to a file_name or directory
  • kwargs (optional) – csv dialect options
Returns:

list of lists if a single file_name was provided: [[row1.1, row1.2]]
dict of lists of lists if multiple files were provided: {file_name: [[row1.1, row1.2]]}

Return type:

list, dict

datesy.file_IO.csv_file.load_single(file_name, **kwargs)

Load a csv file and return the rows

Parameters:
  • file_name (str) – file_name to load from
  • kwargs (optional) – csv dialect options
Returns:

list of lists representing the csv data [[row1.1, row1.2]]

Return type:

list
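
The "csv dialect options" are presumably the formatting parameters of Python's csv module, such as delimiter or quotechar. A minimal sketch that forwards them to csv.reader, assuming this is how kwargs is used; the helper name load_csv is hypothetical.

```python
import csv


def load_csv(file_name, **kwargs):
    """Return the rows of a csv file as a list of lists of strings."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(file_name, newline="", encoding="utf-8") as file:
        return list(csv.reader(file, **kwargs))
```

A semicolon-separated file would then be read with load_csv("data.csv", delimiter=";"). All values come back as strings, since the csv module does no type conversion by default.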

datesy.file_IO.csv_file.load_these(file_name_list, **kwargs)

Load specified csv files and return the rows in a dictionary with file_name as key

Parameters:
  • file_name_list (list) – list of file_names to load from
  • kwargs (optional) – csv dialect options
Returns:

the rows from the files as values of file_name as key {file_name : [[row1.1, row1.2]]}

Return type:

dict

datesy.file_IO.csv_file.load_all(directory, **kwargs)

Load all csv files in the directory and return the rows in a dictionary with file_name as key

Parameters:
  • directory (str) – the directory containing the csv files
  • kwargs (optional) – csv dialect options
Returns:

the rows from the files as values of file_name as key {file_name : [[row1.1, row1.2]]}

Return type:

dict

datesy.file_IO.csv_file.write(file_name, data, main_key_name=None, main_key_position=0, order=None, if_empty_value=None, **kwargs)

Save a row-based document from a dict or list to file. If a dictionary is given, it is converted to rows by the dict_to_rows method of this package.

Parameters:
  • file_name (str) – the file_name to save under. If no ending is provided, saved as file_name.csv
  • data (dict, list) – the dictionary or list to be saved as csv
  • main_key_name (str, optional) – the name of the main key; needs to be specified if the json or dict does not have the main key present as a single key ({main_element_name: dict})
  • main_key_position (int, optional) – the position in csv of the dictionary main key
  • order (dict, list, optional) – defines a specific order of the keys: either a dictionary with the positions (integers) as keys, format {int: str}, or a list
  • if_empty_value (any, optional) – the value to set when no handling is available. The default “delete” results in an empty value
  • kwargs (optional) – csv dialect options
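
To make the roles of main_key_name, main_key_position and if_empty_value more concrete, here is a simplified sketch of turning a {main_key: {column: value}} dictionary into rows. It is not the package's dict_to_rows method: the helper name dict_to_rows_sketch is hypothetical, the order option is omitted, and the column order is simply first-seen.

```python
def dict_to_rows_sketch(data, main_key_name, main_key_position=0, if_empty_value=None):
    """Flatten {main_key: {column: value}} into a header row plus data rows."""
    # Collect the column names in first-seen order across all entries.
    columns = []
    for entry in data.values():
        for column in entry:
            if column not in columns:
                columns.append(column)

    header = list(columns)
    header.insert(main_key_position, main_key_name)
    rows = [header]
    for main_key, entry in data.items():
        # Columns missing from an entry are filled with if_empty_value.
        row = [entry.get(column, if_empty_value) for column in columns]
        row.insert(main_key_position, main_key)
        rows.append(row)
    return rows
```

The resulting list of lists has the shape expected by a row-based writer.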

datesy.file_IO.csv_file.write_from_rows(file_name, rows, **kwargs)

Save row based document from rows to file

Parameters:
  • file_name (str) – the file_name to save the data under. if no ending is provided, saved as file_name.csv
  • rows (list) – list of lists to write to file_name
  • kwargs (optional) – csv dialect options

datesy.file_IO.csv_file.write_from_dict(file_name, data, main_key_name=None, main_key_position=0, order=None, if_empty_value=None, **kwargs)

Save a row based document from dict to file

Parameters:
  • file_name (str) – the file_name to save under. if no ending is provided, saved as file_name.csv
  • data (dict) – the dictionary to be saved as csv
  • main_key_name (str, optional) – the name of the main key; needs to be specified if the json or dict does not have the main key present as a single key ({main_element_name: dict})
  • order (dict {int: str}, list, optional) – defines a specific order of the keys: either a dictionary with the positions (integers) as keys or a list
  • if_empty_value (any, optional) – the value to set when no handling is available. The default “delete” results in an empty value
  • main_key_position (int, optional) – the position in csv of the dictionary main key
  • kwargs (optional) – csv dialect options

xls_file module

The xls_file module takes care of all I/O interactions concerning xls(x) files

datesy.file_IO.xls_file.load_single_sheet(file_name, sheet=None)

Load the (first) sheet of an xls(x) file into a pandas.DataFrame

Parameters:
  • file_name (str) – file_name to load from
  • sheet (str, optional) – the sheet_name to extract (default: the first sheet)
Returns:

pandas.DataFrame representing the xls(x) file

Return type:

pandas.DataFrame

datesy.file_IO.xls_file.load_these_sheets(file_name, sheets)

Load the specified sheets of an xls(x) file into pandas.DataFrames, returned in a dictionary with the sheet_names as keys

Parameters:
  • file_name (str) – file_name to load from
  • sheets (list) – sheet_names to load
Returns:

dictionary containing the sheet_names as keys and the pandas.DataFrames representing the xls(x) sheets as values {sheet_name: pandas.DataFrame}

Return type:

dict(pandas.DataFrame)

datesy.file_IO.xls_file.load_all_sheets(file_name)

Load all sheets of an xls(x) file into pandas.DataFrames, returned in a dictionary with the sheet_names as keys

Parameters: file_name (str) – file_name to load from
Returns: dictionary containing the sheet_names as keys and the pandas.DataFrames representing the xls(x) sheets as values {sheet_name: pandas.DataFrame}
Return type: dict

datesy.file_IO.xls_file.load_these_files(file_name_list)

Load the specified xls(x) files with all their sheets into pandas.DataFrames, stored per file in a dictionary with the sheet_names as keys

Parameters: file_name_list (list) – list of file_names to load from
Returns: the data from the sheets in a dictionary with sheet_name as key, nested in a dictionary with file_name as key {file_name: {sheet_name: pandas.DataFrame}}
Return type: dict

datesy.file_IO.xls_file.load_all_files(directory)

Load all xls(x) files in the directory with all their sheets into pandas.DataFrames, stored per file in a dictionary with the sheet_names as keys

Parameters: directory (str) – the directory containing the xls(x) files
Returns: the data from the sheets in a dictionary with sheet_name as key, nested in a dictionary with file_name as key {file_name: {sheet_name: pandas.DataFrame}}
Return type: dict

datesy.file_IO.xls_file.load(path)

Load xls(x) file(s) with all their sheets into pandas.DataFrames, stored per file in a dictionary with the sheet_names as keys. Specifying a file_name: one file will be loaded. Specifying a directory: all *.xls(x) files will be loaded.

Parameters: path (str) – path to a file_name or directory
Returns: dictionary containing the sheets as pandas.DataFrames: {file_name: {sheet_name: pandas.DataFrame}}
Return type: dict

datesy.file_IO.xls_file.write_single_sheet_from_DataFrame(file_name, data_frame, sheet_name=None, auto_size_cells=True)

Save a pandas.DataFrame to file

Parameters:
  • file_name (str) – the file_name to save under. if no ending is provided, saved as .xlsx
  • data_frame (pandas.DataFrame) – pandas.DataFrame to write to file_name
  • sheet_name (str, optional) – a sheet_name containing the data
  • auto_size_cells (bool, optional) – if the auto-sizing of the cells shall be active

datesy.file_IO.xls_file.write_multi_sheet_from_DataFrames(file_name, data_frames, sheet_order=None, auto_size_cells=True)

Save multiple pandas.DataFrames to one file

Parameters:
  • file_name (str) – the file_name to save under. if no ending is provided, saved as .xlsx
  • data_frames (dict {sheet_name: DataFrame}) – dict of data_frames
  • sheet_order (dict {int: str}, list, optional) – the order of the sheets: either a dictionary with the positions (integers) as keys or a list
  • auto_size_cells (bool, optional) – if the auto-sizing of the cells shall be active
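
The two accepted forms of an order argument, {int: str} or list, can be brought to a common form before writing. The sketch below shows one way to do so. The helper name resolve_order is hypothetical, and two behaviours are assumptions of this example rather than documented facts: positions are treated as relative ranks, and names not mentioned in the order are appended at the end.

```python
def resolve_order(names, order=None):
    """Return `names` rearranged according to `order` ({position: name} or list)."""
    if order is None:
        return list(names)
    if isinstance(order, dict):
        # Sort by the integer positions, then keep only the names.
        ordered = [order[position] for position in sorted(order)]
    else:
        ordered = list(order)
    # Names not mentioned in `order` keep their original relative order at the end.
    return ordered + [name for name in names if name not in ordered]
```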

datesy.file_IO.xls_file.write_single_sheet_from_dict(file_name, data, main_key_name=None, sheet=None, order=None, inverse=False, auto_size_cells=True)

Save a dictionary ({main_key_name: {data}}) as xlsx document to file. Uses the dict_to_pandas_data_frame method of this package for converting the dictionary to pandas.DataFrame.

Parameters:
  • file_name (str) – the file_name to save under. if no ending is provided, saved as .xlsx
  • data (dict) – the dictionary to be saved as xlsx {main_key_name: {data}}
  • main_key_name (str, optional) – the name of the main key; needs to be specified if the json or dict does not have the main key present as a single key ({main_element : dict})
  • sheet (str, optional) – a sheet name for the handling
  • order (dict, list, optional) – the order of the keys: either a dictionary with the positions (integers) as keys or a list
  • inverse (bool, optional) – if columns and rows shall be switched
  • auto_size_cells (bool, optional) – if the auto-sizing of the cells shall be active

datesy.file_IO.xls_file.write_multi_sheet_from_dict_of_dicts(file_name, data, order=None, auto_size_cells=True)

Save dictionaries ({sheet_name: {main_key_name: {data}}}) as xlsx document to file. Uses the dict_to_pandas_data_frame method of this package for converting the dictionary to pandas.DataFrame.

Parameters:
  • file_name (str) – the file_name to save under. if no ending is provided, saved as .xlsx
  • data (dict) – the dictionary to be saved as xlsx {sheet_name: {main_key_name: {data}}}
  • order (dict, list, optional) – the order of the keys: either a dictionary with the positions (integers) as keys or a list
  • auto_size_cells (bool, optional) – if the auto-sizing of the cells shall be active

xml_file module

The xml_file module takes care of all I/O interactions concerning xml files

datesy.file_IO.xml_file.load(path)

Load xml file(s) and return the dictionary or dictionaries. Specifying a file_name: one file will be loaded. Specifying a directory: all *.xml files will be loaded.

Parameters: path (str) – path to a file_name or directory
Returns: dictionary representing the xml {file_name: {data}}
Return type: dict

datesy.file_IO.xml_file.load_single(file_name)

Load a single xml file

Parameters: file_name (str) – file_name to load from
Returns: the xml as ordered dict {collections.OrderedDict}
Return type: dict

datesy.file_IO.xml_file.load_these(file_name_list)

Load specified xml files and return the data in a dictionary with file_name as key

Parameters: file_name_list (list) – list of file_names to load from
Returns: the dictionaries from the files as values of file_name as key {file_name: {collections.OrderedDict}}
Return type: dict(collections.OrderedDict)

datesy.file_IO.xml_file.load_all(directory)

Load all xml files in the directory and return the data in a dictionary with file_name as key

Parameters: directory (str) – the directory containing the xml files
Returns: the dictionaries from the files as values of file_name as key {file_name: {collections.OrderedDict}}
Return type: dict(collections.OrderedDict)

datesy.file_IO.xml_file.write(file_name, data, main_key_name=None)

Save xml file from dict or collections.OrderedDict to file

Parameters:
  • file_name (str) – the file_name to save under. if no ending is provided, saved as .xml
  • data (dict, collections.OrderedDict) – the dictionary to be saved as xml
  • main_key_name (str, optional) – the name of the main key; needs to be specified if the dict/OrderedDict does not have the main key present as a single key ({main_element_name: dict})
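
The xml functions of this module exchange data as nested dictionaries with a single main key. What such a mapping can look like is sketched below with the standard library's xml.etree.ElementTree. This is an illustration only, not the conversion datesy performs: the helper names element_to_dict and load_xml are hypothetical, plain dicts are used instead of collections.OrderedDict, and attributes as well as mixed text content are ignored.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an element to {tag: text} or {tag: {child_tag: ...}}."""
    children = list(element)
    if not children:
        return {element.tag: element.text}
    content = {}
    for child in children:
        for tag, value in element_to_dict(child).items():
            if tag in content:
                # Repeated tags are gathered into a list.
                if not isinstance(content[tag], list):
                    content[tag] = [content[tag]]
                content[tag].append(value)
            else:
                content[tag] = value
    return {element.tag: content}


def load_xml(file_name):
    """Parse an xml file and return it as a nested dictionary."""
    return element_to_dict(ET.parse(file_name).getroot())
```

The root tag ends up as the single top-level key, which corresponds to the {main_element_name: dict} shape mentioned for main_key_name.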