File I/O subpackage
All datesy functions that handle file I/O are listed here, grouped by file type. The only exception is the first module, which provides functions for file selection.
file_selection module
The file_selection module provides several supporting functions for interacting with files.
datesy.file_IO.file_selection.get_latest_file_from_directory(directory, file_ending=None, pattern=None, regex=None)
Return the latest file_name (optionally filtered) from a directory.
Parameters: - directory (str) – the directory where to get the latest file_name from
- file_ending (str, set, optional) – the file_name ending specifying the file_name type
- pattern (str, optional) – pattern for the file_name to match, e.g. DataFile_*.json, where * could be a date or other strings
- regex (str, optional) – a regular expression (regex) for pattern matching
Returns: the file_name with the latest change date
Return type: str
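The selection described above can be approximated with the standard library. This is a minimal sketch and not datesy's implementation: `latest_file` is a hypothetical helper, and it assumes that "latest" means the most recent modification time, that `pattern` is a shell-style glob, and that a set of endings matches if any one of them matches.

```python
import fnmatch
import os
import re


def latest_file(directory, file_ending=None, pattern=None, regex=None):
    """Return the most recently modified file in `directory`, or None.

    Hypothetical stand-in for illustration only (not part of datesy).
    """
    # Accept a single ending or a collection of endings.
    endings = None
    if file_ending is not None:
        endings = (file_ending,) if isinstance(file_ending, str) else tuple(file_ending)

    candidates = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        if endings and not name.endswith(endings):
            continue
        if pattern and not fnmatch.fnmatch(name, pattern):
            continue
        if regex and not re.search(regex, name):
            continue
        candidates.append(path)

    if not candidates:
        return None
    # Assumption: "latest" is decided by the modification timestamp.
    return max(candidates, key=os.path.getmtime)
```

All filters are optional and combine with a logical AND, which mirrors how the parameters are documented here; whether datesy combines them the same way is not stated in this reference.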
datesy.file_IO.file_selection.get_file_list_from_directory(directory, file_ending=None, pattern=None, regex=None)
Return all files (optionally filtered) from a directory in a list.
Parameters: - directory (str) – the directory containing the desired files
- file_ending (str, set, optional) – the file_name’s ending specifying the file type
- pattern (str, optional) – pattern for the file_names to match, e.g. DataFile_*.json, where * could be a date or other strings
- regex (str, optional) – a regular expression (regex) for pattern matching
Returns: a list of all relative file paths
Return type: list
datesy.file_IO.file_selection.return_file_list_if_path(path, file_ending=None, pattern=None, regex=None, return_always_list=False)
Return all files in the directory (optionally filtered) if path is a directory; otherwise return the path itself.
Parameters: - path (str) – the path to test for being a directory
- file_ending (str, set, optional) – the file_name ending specifying the file type for the files in the directory
- pattern (str, optional) – pattern for the file_names in the directory to match, e.g. DataFile_*.json, where * could be a date or other strings
- regex (str, optional) – a regular expression (regex) for pattern matching of the file_names
- return_always_list (bool, optional) – whether a single path shall also be returned in a list
Returns: the list of files if path is a directory, else the path (in a list if return_always_list is set)
Return type: list, str
datesy.file_IO.file_selection.check_file_name_ending(file_name, ending)
Check if the file_name has the expected file_ending.
If one of the provided endings is the file_name’s ending, return True, else False.
Parameters: - file_name (str) – the file_name to check the ending for. The file_name may contain a path, so file_name.ending as well as path/to/file_name.ending will work
- ending (str, set, list) – the desired ending or multiple desired endings. For single entries e.g. .json or csv, for multiple endings e.g. ['.json', 'csv']
Returns: True if the file_name’s ending is in the given ending, else False
Return type: bool
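The check described above can be sketched as follows. `has_ending` is a hypothetical helper, not the datesy function, and it assumes (based on the `.json` / `csv` examples in the parameter description) that endings are accepted with or without the leading dot.

```python
def has_ending(file_name, ending):
    """Return True if `file_name` ends with one of the given endings.

    Hypothetical stand-in for illustration only (not part of datesy).
    """
    # Normalize a single ending to a list so both call styles share one path.
    endings = [ending] if isinstance(ending, str) else list(ending)
    # Assumption: "csv" and ".csv" are treated as the same ending.
    normalized = tuple("." + e.lstrip(".") for e in endings)
    return file_name.endswith(normalized)


print(has_ending("path/to/file_name.json", ".json"))
print(has_ending("file_name.csv", [".json", "csv"]))
print(has_ending("file_name.xml", {"json", "csv"}))
```

Because only the end of the string is inspected, a leading path does not affect the result.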
json_file module
The json_file module takes care of all I/O interactions concerning json files.
datesy.file_IO.json_file.load(path)
Load json file(s) and return the dictionary or dictionaries. If a file_name is specified, that one file is loaded. If a directory is specified, all *.json files in it are loaded.
Parameters: path (str) – path to a file_name or directory
Returns: dictionary representing the json {file_name: {data}}
Return type: dict
datesy.file_IO.json_file.load_single(file_name)
Load a single json file.
Parameters: file_name (str) – file_name to load from
Returns: the loaded json as a dict {data}
Return type: dict
datesy.file_IO.json_file.load_these(file_name_list)
Load the specified json files and return the data in a dictionary with file_name as key.
Parameters: file_name_list (list) – list of file_names to load from
Returns: the dictionaries from the files as values, with file_name as key {file_name: {data}}
Return type: dict(dict)
datesy.file_IO.json_file.load_all(directory)
Load all json files in the directory and return the data in a dictionary with file_name as key.
Parameters: directory (str) – the directory containing the json files
Returns: the dictionaries from the files as values, with file_name as key {file_name: {data}}
Return type: dict(dict)
datesy.file_IO.json_file.write(file_name, data, beautify=True, sort=False)
Save json from dict to file.
Parameters: - file_name (str) – the file_name to save under. If no ending is provided, saved as .json
- data (dict) – the dictionary to be saved as json
- beautify (bool, optional) – whether the data is written in human-readable form or in a single row (default: human readable)
- sort (bool, optional) – if the keys shall be ordered (default: false)
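The `{file_name: {data}}` return shape and the `beautify` / `sort` options can be illustrated with the standard `json` module. This is a sketch under assumptions, not datesy's code: `load_json_files` and `write_json` are hypothetical names, `beautify` is assumed to map to an indented dump, and `sort` is assumed to map to sorted keys.

```python
import json
import os


def load_json_files(file_names):
    """Return {file_name: parsed_json} for every given path (illustrative helper)."""
    result = {}
    for file_name in file_names:
        with open(file_name, encoding="utf-8") as handle:
            result[file_name] = json.load(handle)
    return result


def write_json(file_name, data, beautify=True, sort=False):
    """Write `data` as json and return the path that was actually used."""
    # Append ".json" only when the name carries no ending at all.
    if not os.path.splitext(file_name)[1]:
        file_name += ".json"
    with open(file_name, "w", encoding="utf-8") as handle:
        # Assumption: beautify -> indented output, sort -> sorted keys.
        json.dump(data, handle, indent=4 if beautify else None, sort_keys=sort)
    return file_name
```

Keying the loaded data by file name keeps the origin of each dictionary visible when several files are processed together.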
csv_file module
The csv_file module takes care of all I/O interactions concerning csv files.
datesy.file_IO.csv_file.load(path, **kwargs)
Load csv file(s) and return the rows. If a file_name is specified, that one file is loaded. If a directory is specified, all *.csv files in it are loaded.
Parameters: - path (str) – path to a file_name or directory
- kwargs (optional) – csv dialect options
Returns: list of lists if a single file_name was provided: [[row1.1, row1.2]]; dict of lists of lists if multiple files were provided: {file_name: [[row1.1, row1.2]]}
Return type: list, dict
datesy.file_IO.csv_file.load_single(file_name, **kwargs)
Load a csv file and return the rows.
Parameters: - file_name (str) – file_name to load from
- kwargs (optional) – csv dialect options
Returns: list of lists representing the csv data
[[row1.1, row1.2]]
Return type: list
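The list-of-lists return shape and the role of the csv dialect options can be illustrated with the standard `csv` module. This is a sketch, not datesy's implementation: `load_rows` is a hypothetical helper, and it assumes the keyword arguments are passed through to `csv.reader` unchanged.

```python
import csv


def load_rows(file_name, **kwargs):
    """Return the csv content as a list of rows, each row a list of strings.

    Hypothetical stand-in for illustration only (not part of datesy).
    """
    # newline="" lets the csv module handle line endings itself.
    with open(file_name, newline="", encoding="utf-8") as handle:
        # Assumption: dialect options such as delimiter=";" go straight to csv.reader.
        return [row for row in csv.reader(handle, **kwargs)]
```

For a semicolon-separated file, a call such as `load_rows("data.csv", delimiter=";")` splits each line on `;`, while the default dialect would keep each line as a single cell.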
datesy.file_IO.csv_file.load_these(file_name_list, **kwargs)
Load the specified csv files and return the rows in a dictionary with file_name as key.
Parameters: - file_name_list (list) – list of file_names to load from
- kwargs (optional) – csv dialect options
Returns: the rows from the files as values of file_name as key
{file_name : [[row1.1, row1.2]]}
Return type: dict
datesy.file_IO.csv_file.load_all(directory, **kwargs)
Load all csv files in the directory and return the rows in a dictionary with file_name as key.
Parameters: - directory (str) – the directory containing the csv files
- kwargs (optional) – csv dialect options
Returns: the rows from the files as values of file_name as key
{file_name : [[row1.1, row1.2]]}
Return type: dict
datesy.file_IO.csv_file.write(file_name, data, main_key_name=None, main_key_position=0, order=None, if_empty_value=None, **kwargs)
Save a row based document from dict or list to file. If a dictionary is passed, it is converted to rows by the dict_to_rows method of this package.
Parameters: - file_name (str) – the file_name to save under. If no ending is provided, saved as file_name.csv
- data (dict, list) – the dictionary or list to be saved as csv
- main_key_name (str, optional) – if the json or dict does not have the main key as a single key present ({main_element_name: dict}), it needs to be specified
- main_key_position (int, optional) – the position in csv of the dictionary main key
- order (dict, list, optional) – for defining a specific order of the keys: either a dictionary with positions (integers) as keys, format {int: str}, or a list
- if_empty_value (any, optional) – the value to set when no handling is available. The default is “delete”, which leads to an empty value
- kwargs (optional) – csv dialect options
datesy.file_IO.csv_file.write_from_rows(file_name, rows, **kwargs)
Save a row based document from rows to file.
Parameters: - file_name (str) – the file_name to save the data under. if no ending is provided, saved as file_name.csv
- rows (list) – list of lists to write to file_name
- kwargs (optional) – csv dialect options
datesy.file_IO.csv_file.write_from_dict(file_name, data, main_key_name=None, main_key_position=0, order=None, if_empty_value=None, **kwargs)
Save a row based document from dict to file.
Parameters: - file_name (str) – the file_name to save under. If no ending is provided, saved as file_name.csv
- data (dict) – the dictionary to be saved as csv
- main_key_name (str, optional) – if the json or dict does not have the main key as a single key present ({main_element_name: dict}), it needs to be specified
- order (dict {int: str}, list, optional) – for defining a specific order of the keys: either a dictionary with positions (integers) as keys or a list
- if_empty_value (any, optional) – the value to set when no handling is available. The default is “delete”, which leads to an empty value
- main_key_position (int, optional) – the position in csv of the dictionary main key
- kwargs (optional) – csv dialect options
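To make the roles of `main_key_name`, `main_key_position` and `if_empty_value` more concrete, here is one possible way to flatten a nested dictionary into rows. This is only a sketch of the idea: `nested_dict_to_rows` is a hypothetical function, it is not the package's dict_to_rows method, and the exact row layout datesy produces is an assumption.

```python
def nested_dict_to_rows(data, main_key_name, main_key_position=0, if_empty_value=None):
    """Flatten {main_key: {column: value}} into a header row plus data rows.

    Hypothetical stand-in for illustration only (not part of datesy).
    """
    # Collect the column names in first-seen order across all entries.
    columns = []
    for values in data.values():
        for key in values:
            if key not in columns:
                columns.append(key)

    header = list(columns)
    header.insert(main_key_position, main_key_name)
    rows = [header]

    for main_key, values in data.items():
        # Missing cells are filled with if_empty_value.
        row = [values.get(column, if_empty_value) for column in columns]
        row.insert(main_key_position, main_key)
        rows.append(row)
    return rows


example = {"a": {"x": 1, "y": 2}, "b": {"x": 3}}
for row in nested_dict_to_rows(example, "id", if_empty_value=""):
    print(row)
```

In this sketch the main key becomes its own column, named by `main_key_name` and placed at `main_key_position`; the resulting rows could then be handed to a row-based writer.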
xls_file module
The xls_file module takes care of all I/O interactions concerning xls(x) files.
datesy.file_IO.xls_file.load_single_sheet(file_name, sheet=None)
Load an xls(x) file’s (first) sheet into a pandas.DataFrame.
Parameters: - file_name (str) – file_name to load from
- sheet (str, optional) – a specified sheet_name to extract. default is first sheet
Returns: pandas.DataFrame representing the xls(x) file
Return type: pandas.DataFrame
datesy.file_IO.xls_file.load_these_sheets(file_name, sheets)
Load the specified sheets of an xls(x) file into a dictionary with sheet_names as keys and pandas.DataFrames as values.
Parameters: - file_name (str) – file_name to load from
- sheets (list) – sheet_names to load
Returns: dictionary containing the sheet_names as keys and the pandas.DataFrames representing the xls(x) sheets as values: {sheet_name: pandas.DataFrame}
Return type: dict(pandas.DataFrame)
datesy.file_IO.xls_file.load_all_sheets(file_name)
Load all sheets of an xls(x) file into a dictionary with sheet_names as keys and pandas.DataFrames as values.
Parameters: file_name (str) – file_name to load from
Returns: dictionary containing the sheet_names as keys and the pandas.DataFrames representing the xls(x) sheets as values: {sheet_name: pandas.DataFrame}
Return type: dict
datesy.file_IO.xls_file.load_these_files(file_name_list)
Load the specified xls(x) files with all their sheets. Each sheet is loaded as a pandas.DataFrame and stored under its sheet_name in a dictionary.
Parameters: file_name_list (list) – list of file_names to load from
Returns: the data from the sheets in a dictionary keyed by sheet_name, nested in a dictionary keyed by file_name: {file_name: {sheet_name: pandas.DataFrame}}
Return type: dict
datesy.file_IO.xls_file.load_all_files(directory)
Load all xls(x) files in the directory with all their sheets. Each sheet is loaded as a pandas.DataFrame and stored under its sheet_name in a dictionary.
Parameters: directory (str) – the directory containing the xlsx files
Returns: the data from the sheets in a dictionary keyed by sheet_name, nested in a dictionary keyed by file_name: {file_name: {sheet_name: pandas.DataFrame}}
Return type: dict
datesy.file_IO.xls_file.load(path)
Load xls(x) file(s) with all their sheets. Each sheet is loaded as a pandas.DataFrame and stored under its sheet_name in a dictionary. If a file_name is specified, that one file is loaded. If a directory is specified, all *.xls(x) files in it are loaded.
Parameters: path (str) – path to a file_name or directory
Returns: dictionary containing the sheets as pandas.DataFrames: {file_name: {sheet_name: pandas.DataFrame}}
Return type: dict
datesy.file_IO.xls_file.write_single_sheet_from_DataFrame(file_name, data_frame, sheet_name=None, auto_size_cells=True)
Save a pandas.DataFrame to file.
Parameters: - file_name (str) – the file_name to save under. if no ending is provided, saved as .xlsx
- data_frame (pandas.DataFrame) – pandas.DataFrame to write to file_name
- sheet_name (str, optional) – a sheet_name containing the data
- auto_size_cells (bool, optional) – if the auto-sizing of the cells shall be active
datesy.file_IO.xls_file.write_multi_sheet_from_DataFrames(file_name, data_frames, sheet_order=None, auto_size_cells=True)
Save multiple pandas.DataFrames to one file.
Parameters: - file_name (str) – the file_name to save under. If no ending is provided, saved as .xlsx
- data_frames (dict {sheet_name: DataFrame}) – dict of data_frames
- sheet_order (dict {int: str}, list, optional) – the order of the sheets, either as a dictionary with positions (integers) as keys or as a list
- auto_size_cells (bool, optional) – if the auto-sizing of the cells shall be active
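The two accepted shapes of an order specification, a `{position: name}` dictionary or a plain list, can be normalized into one list before writing. The sketch below is an illustration under assumptions, not datesy's logic: `resolve_order` is a hypothetical helper, and appending unlisted names in their original order is an assumed policy.

```python
def resolve_order(names, order=None):
    """Return `names` arranged according to `order`.

    `order` may be None, a list of names, or a {position: name} dict.
    Hypothetical stand-in for illustration only (not part of datesy).
    """
    names = list(names)
    if order is None:
        return names
    if isinstance(order, dict):
        # Sort by the integer positions used as keys.
        ordered = [order[position] for position in sorted(order)]
    else:
        ordered = list(order)

    unknown = [name for name in ordered if name not in names]
    if unknown:
        raise ValueError(f"unknown names in order: {unknown}")

    # Assumption: names not mentioned in `order` keep their relative order at the end.
    remaining = [name for name in names if name not in ordered]
    return ordered + remaining


print(resolve_order(["a", "b", "c"], {1: "a", 0: "c"}))
print(resolve_order(["a", "b", "c"], ["b"]))
```

Normalizing early means the writing code only ever has to deal with a single list, regardless of which form the caller chose.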
datesy.file_IO.xls_file.write_single_sheet_from_dict(file_name, data, main_key_name=None, sheet=None, order=None, inverse=False, auto_size_cells=True)
Save a dictionary ({main_key_name: {data}}) as xlsx document to file. Uses the dict_to_pandas_data_frame method of this package for converting the dictionary to a pandas.DataFrame.
Parameters: - file_name (str) – the file_name to save under. If no ending is provided, saved as .xlsx
- data (dict) – the dictionary to be saved as xlsx: {main_key_name: {data}}
- main_key_name (str, optional) – if the json or dict does not have the main key as a single key present ({main_element: dict}), it needs to be specified
- sheet (str, optional) – a sheet name for the handling
- order (dict, list, optional) – the order of the keys, either as a dictionary with positions (integers) as keys or as a list
- inverse (bool, optional) – if columns and rows shall be switched
- auto_size_cells (bool, optional) – if the auto-sizing of the cells shall be active
datesy.file_IO.xls_file.write_multi_sheet_from_dict_of_dicts(file_name, data, order=None, auto_size_cells=True)
Save dictionaries ({sheet_name: {main_key_name: {data}}}) as xlsx document to file. Uses the dict_to_pandas_data_frame method of this package for converting the dictionaries to pandas.DataFrames.
Parameters: - file_name (str) – the file_name to save under. If no ending is provided, saved as .xlsx
- data (dict) – the dictionary to be saved as xlsx: {sheet_name: {main_key_name: {data}}}
- order (dict, list, optional) – the order, either as a dictionary with positions (integers) as keys or as a list
- auto_size_cells (bool, optional) – if the auto-sizing of the cells shall be active
xml_file module
The xml_file module takes care of all I/O interactions concerning xml files.
datesy.file_IO.xml_file.load(path)
Load xml file(s) and return the dictionary or dictionaries. If a file_name is specified, that one file is loaded. If a directory is specified, all *.xml files in it are loaded.
Parameters: path (str) – path to a file_name or directory
Returns: dictionary representing the xml {file_name: {data}}
Return type: dict
datesy.file_IO.xml_file.load_single(file_name)
Load a single xml file.
Parameters: file_name (str) – file_name to load from
Returns: the xml as ordered dict {collections.OrderedDict}
Return type: dict
datesy.file_IO.xml_file.load_these(file_name_list)
Load the specified xml files and return the data in a dictionary with file_name as key.
Parameters: file_name_list (list) – list of file_names to load from
Returns: the dictionaries from the files as values, with file_name as key {file_name: {collections.OrderedDict}}
Return type: dict(collections.OrderedDict)
datesy.file_IO.xml_file.load_all(directory)
Load all xml files in the directory and return the data in a dictionary with file_name as key.
Parameters: directory (str) – the directory containing the xml files
Returns: the dictionaries from the files as values, with file_name as key {file_name: {collections.OrderedDict}}
Return type: dict(collections.OrderedDict)
datesy.file_IO.xml_file.write(file_name, data, main_key_name=None)
Save xml file from dict or collections.OrderedDict to file.
Parameters: - file_name (str) – the file_name to save under. if no ending is provided, saved as .xml
- data (dict, collections.OrderedDict) – the dictionary to be saved as xml
- main_key_name (str) – if the dict/OrderedDict does not have the main key as a single key present ({main_element_name: dict}), it needs to be specified
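An xml document needs exactly one root element, which is presumably why a main key is required when the dictionary has several top-level keys. The sketch below illustrates that idea with the standard `xml.etree.ElementTree` module. It is not datesy's implementation (which library datesy uses internally is not stated in this reference), and `dict_to_xml` and `_fill` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET


def _fill(element, value):
    """Recursively copy a nested dict into child elements; leaves become text."""
    if isinstance(value, dict):
        for key, child_value in value.items():
            _fill(ET.SubElement(element, key), child_value)
    else:
        element.text = str(value)


def dict_to_xml(data, main_key_name=None):
    """Serialize a nested dict to an xml string with a single root element.

    Hypothetical stand-in for illustration only (not part of datesy).
    """
    if main_key_name is None:
        if len(data) != 1:
            raise ValueError("main_key_name is required when data has several top-level keys")
        # The single top-level key doubles as the root element name.
        main_key_name, data = next(iter(data.items()))
    root = ET.Element(main_key_name)
    _fill(root, data)
    return ET.tostring(root, encoding="unicode")


print(dict_to_xml({"root": {"a": 1}}))
print(dict_to_xml({"a": 1, "b": 2}, main_key_name="root"))
```

When the dictionary already has the form `{main_element_name: dict}`, the root name is taken from its only key; otherwise the caller supplies it explicitly.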