IO Modules
Functions for reading and writing data.
LoadScans
Load the image and image parameters from a file path.
Parameters
img_paths : list[str | Path] Paths to valid AFM scans to load. channel : str Image channel to extract from the scan. extract : str What to extract from '.topostats' files. The default, 'all', loads everything; when called from 'run_topostats' functions only specific subsets of data are required, and this option allows just those to be loaded. Options currently include 'raw' and 'filter'.
Source code in topostats/io.py
__init__(img_paths, channel, extract='all')
Initialise the class.
Parameters
img_paths : list[str | Path] Paths to valid AFM scans to load. channel : str Image channel to extract from the scan. extract : str What to extract from '.topostats' files. The default, 'all', loads everything; when called from 'run_topostats' functions only specific subsets of data are required, and this option allows just those to be loaded. Options currently include 'raw' and 'filter'.
Source code in topostats/io.py
_check_image_size_and_add_to_dict(image, filename)
Check the image is above a minimum size in both dimensions.
Images that do not meet the minimum size are not included for processing.
Parameters
image : npt.NDArray An array of the extracted AFM image. filename : str The name of the file.
Source code in topostats/io.py
add_to_dict(image, filename)
Add an image and metadata to the img_dict dictionary under the key filename.
Adds the image and associated metadata, such as any grain masks and the pixel to nanometre scaling factor, to the img_dict dictionary, which is used to store the image information for processing.
Parameters
image : npt.NDArray An array of the extracted AFM image. filename : str The name of the file.
Source code in topostats/io.py
clean_dict(img_dict)
If we are loading .topostats files for reprocessing we already have the dictionary structure.
We therefore need to extract just the information that is required for the stage requested and remove everything else.
Parameters
img_dict : dict[str, Any] Original image dictionary from which data is to be extracted.
Returns
dict[str, Any] Returns the image dictionary with keys/values removed appropriate to the extraction stage.
Source code in topostats/io.py
get_data()
Extract the image, filepath and pixel to nanometre scaling value, and add these to the img_dict object.
Source code in topostats/io.py
load_asd()
Extract image and pixel to nm scaling from .asd files.
Returns
tuple[npt.NDArray, float] A tuple containing the image and its pixel to nanometre scaling value.
Source code in topostats/io.py
load_gwy()
Extract image and pixel to nm scaling from the Gwyddion .gwy file.
Returns
tuple[npt.NDArray, float] A tuple containing the image and its pixel to nanometre scaling value.
Source code in topostats/io.py
load_ibw()
Load image from Asylum Research (Igor) .ibw files.
Returns
tuple[npt.NDArray, float] A tuple containing the image and its pixel to nanometre scaling value.
Source code in topostats/io.py
load_jpk()
Load image from JPK Instruments .jpk files.
Returns
tuple[npt.NDArray, float] A tuple containing the image and its pixel to nanometre scaling value.
Source code in topostats/io.py
load_spm()
Extract image and pixel to nm scaling from the Bruker .spm file.
Returns
tuple[npt.NDArray, float] A tuple containing the image and its pixel to nanometre scaling value.
Source code in topostats/io.py
load_topostats(extract='all')
Load a .topostats file (hdf5 format).
Loads and extracts the image, pixel to nanometre scaling factor and any grain masks.
Note that grain masks are stored via self.grain_masks rather than returned due to how we extract information for all other file loading functions.
Parameters
extract : str Which image (Numpy array) and data to extract. The default, 'all', returns the cleaned (post-Filter) image, pixel_to_nm_scaling and all data. Image arrays for other stages of processing, such as 'raw' or 'filter', can also be extracted.
Returns
dict[str, Any] | tuple[npt.NDArray, float, Any] A dictionary of all previously processed data, or a tuple containing the image and its pixel to nanometre scaling value. Which is returned depends on the 'extract' option.
Source code in topostats/io.py
convert_basename_to_relative_paths(df)
Convert paths in the 'basename' column of a dataframe to relative paths.
If the 'basename' column has the following paths: ['/usr/topo/data/a/b', '/usr/topo/data/c/d'], the output will be: ['a/b', 'c/d'].
Parameters
df : pd.DataFrame A pandas dataframe containing a column 'basename' which contains the paths indicating the locations of the image data files.
Returns
pd.DataFrame A pandas dataframe where the 'basename' column has paths relative to a common parent.
Source code in topostats/io.py
dict_almost_equal(dict1, dict2, abs_tol=1e-09)
Recursively check if two dictionaries are almost equal with a given absolute tolerance.
Parameters
dict1 : dict First dictionary to compare. dict2 : dict Second dictionary to compare. abs_tol : float Absolute tolerance to check for equality.
Returns
bool True if the dictionaries are almost equal, False otherwise.
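The recursive comparison described above can be sketched as follows. This is a minimal illustration rather than the library's implementation: the name `dicts_almost_equal_sketch` is hypothetical, and it assumes that only real numbers are compared with the tolerance while every other value is compared with `==`.

```python
import math


def dicts_almost_equal_sketch(dict1: dict, dict2: dict, abs_tol: float = 1e-9) -> bool:
    """Hypothetical sketch: compare two nested dictionaries within an absolute tolerance."""
    # Dictionaries with different keys can never be (almost) equal.
    if dict1.keys() != dict2.keys():
        return False
    for key, value1 in dict1.items():
        value2 = dict2[key]
        if isinstance(value1, dict) and isinstance(value2, dict):
            # Recurse into nested dictionaries.
            if not dicts_almost_equal_sketch(value1, value2, abs_tol=abs_tol):
                return False
        elif isinstance(value1, (int, float)) and isinstance(value2, (int, float)):
            # Assumption: numbers are compared using the absolute tolerance only.
            if not math.isclose(value1, value2, rel_tol=0.0, abs_tol=abs_tol):
                return False
        elif value1 != value2:
            return False
    return True
```

Values such as Numpy arrays would need their own comparison branch, which this sketch leaves out.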
Source code in topostats/io.py
dict_to_hdf5(open_hdf5_file, group_path, dictionary)
Recursively save a dictionary to an open hdf5 file.
Parameters
open_hdf5_file : h5py.File An open hdf5 file object. group_path : str The path to the group in the hdf5 file to start saving data from. dictionary : dict A dictionary of the data to save.
Source code in topostats/io.py
dict_to_json(data, output_dir, filename, indent=4)
Write a dictionary to a JSON file at the specified location with the given name.
The NumpyEncoder class is used as the default encoder to ensure Numpy dtypes are written as strings (they are not serialisable to JSON using the default JSONEncoder).
Parameters
data : dict Data as a dictionary that is to be written to file. output_dir : str | Path Directory the file is to be written to. filename : str | Path Name of output file. indent : int Spaces to indent JSON with, default is 4.
Source code in topostats/io.py
find_files(base_dir=None, file_ext='.spm')
Recursively scan the specified directory for images with the given file extension.
Parameters
base_dir : Union[str, Path] Directory to recursively search for files, if not specified the current directory is scanned. file_ext : str File extension to search for.
Returns
List List of files found with the extension in the given directory.
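A recursive search of this kind can be sketched with `pathlib`. The function name `find_files_sketch` is hypothetical, and sorting the results is an assumption made here for reproducibility; the source does not say in what order the files are returned.

```python
from pathlib import Path


def find_files_sketch(base_dir=None, file_ext=".spm"):
    """Hypothetical sketch: recursively list files ending in ``file_ext``."""
    # Fall back to the current directory when no base directory is given.
    base = Path("./") if base_dir is None else Path(base_dir)
    # rglob walks all subdirectories; directories that happen to match are skipped.
    return sorted(path for path in base.rglob(f"*{file_ext}") if path.is_file())
```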
Source code in topostats/io.py
get_date_time()
Get a date and time for adding to generated files or logging.
Returns
str A string of the current date and time, formatted appropriately.
Source code in topostats/io.py
get_out_path(image_path=None, base_dir=None, output_dir=None)
Add the image path relative to the base directory to the output directory.
Parameters
image_path : Path The path of the current image. base_dir : Path Directory to recursively search for files. output_dir : Path The output directory specified in the configuration file.
Returns
Path The output path that mirrors the input path structure.
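The mirroring of the input structure can be illustrated with `Path.relative_to`. This is a sketch under stated assumptions: `get_out_path_sketch` is a hypothetical name, the image is assumed to live under `base_dir`, and whether the library keeps or strips the file suffix is not stated in the text above, so the suffix is kept here.

```python
from pathlib import Path


def get_out_path_sketch(image_path: Path, base_dir: Path, output_dir: Path) -> Path:
    """Hypothetical sketch: place the image's relative path under ``output_dir``."""
    # Raises ValueError if image_path is not located under base_dir.
    relative = Path(image_path).relative_to(base_dir)
    return Path(output_dir) / relative
```

For instance, with `base_dir` set to `data/scans` and `output_dir` set to `output`, the image `data/scans/day1/img.spm` maps to `output/day1/img.spm`.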
Source code in topostats/io.py
get_relative_paths(paths)
Extract a list of relative paths, removing the common prefix.
From a list of paths, create a list where each path is relative to the closest common parent of all the paths. For example, ['a/b/c', 'a/b/d', 'a/b/e/f'] would return ['c', 'd', 'e/f'].
Parameters
paths : list List of string or pathlib paths.
Returns
list List of string paths, relative to the common parent.
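The example above can be reproduced with `os.path.commonpath`. The function name `get_relative_paths_sketch` is hypothetical and this is not necessarily how the library computes the result; note that a single input path would be its own common parent here and would come back as '.'.

```python
import os
from pathlib import Path


def get_relative_paths_sketch(paths):
    """Hypothetical sketch: make each path relative to the closest common parent."""
    # commonpath returns the longest shared leading sequence of path components.
    common_parent = os.path.commonpath([str(path) for path in paths])
    # as_posix() gives forward slashes regardless of platform.
    return [Path(path).relative_to(common_parent).as_posix() for path in paths]
```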
Source code in topostats/io.py
hdf5_to_dict(open_hdf5_file, group_path)
Read a dictionary from an open hdf5 file.
Parameters
open_hdf5_file : h5py.File An open hdf5 file object. group_path : str The path to the group in the hdf5 file to start reading data from.
Returns
dict A dictionary of the hdf5 file data.
Source code in topostats/io.py
load_array(array_path)
Load a Numpy array from file.
Should have been saved using save_array() or numpy.save().
Parameters
array_path : Union[str, Path] Path to the Numpy array on disk.
Returns
npt.NDArray Returns the loaded Numpy array.
Source code in topostats/io.py
load_pkl(infile)
Load data from a pickle.
Parameters
infile : Path Path to a valid pickle.
Returns
dict Dictionary of generated images.
Examples
>>> from pathlib import Path
>>> from topostats.io import load_pkl
>>> pkl_path = Path("output/distribution_plots.pkl")
>>> my_plots = load_pkl(pkl_path)
Show the type of my_plots, which is a dictionary of nested dictionaries.
>>> type(my_plots)
Show the keys at various levels of nesting.
>>> my_plots.keys()
>>> my_plots["area"].keys()
>>> my_plots["area"]["dist"].keys()
Get the figure and axis objects for a given metric's distribution plot.
>>> figure, axis = my_plots["area"]["dist"].values()
Get the figure and axis objects for a given metric's violin plot.
>>> figure, axis = my_plots["area"]["violin"].values()
Source code in topostats/io.py
merge_mappings(map1, map2)
Merge two mappings (dictionaries), with priority given to the second mapping.
Note: Using a Mapping should make this robust to any mapping type, not just dictionaries. MutableMapping was needed as Mapping is not a mutable type, and this function needs to be able to change the dictionaries.
Parameters
map1 : MutableMapping First mapping to merge, with secondary priority. map2 : MutableMapping Second mapping to merge, with primary priority.
Returns
dict Merged dictionary.
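A merge that gives priority to the second mapping might look like the sketch below. The name `merge_mappings_sketch` and the configuration keys in the usage note are hypothetical, and merging nested mappings recursively is an assumption; the text above only states that the second mapping takes priority.

```python
from collections.abc import MutableMapping


def merge_mappings_sketch(map1: MutableMapping, map2: MutableMapping) -> MutableMapping:
    """Hypothetical sketch: merge ``map2`` into ``map1``, with ``map2`` taking priority."""
    for key, value in map2.items():
        if isinstance(value, MutableMapping) and isinstance(map1.get(key), MutableMapping):
            # Assumption: nested mappings are merged key by key rather than replaced wholesale.
            merge_mappings_sketch(map1[key], value)
        else:
            # Values from the second mapping overwrite those from the first.
            map1[key] = value
    return map1
```

Merging `{"filter": {"threshold": 2.5}, "cores": 4}` into `{"filter": {"run": True, "threshold": 1.0}, "cores": 2}` keeps `run` and overrides `threshold` and `cores`. The first mapping is modified in place, which is why a mutable mapping type is required.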
Source code in topostats/io.py
path_to_str(config)
Recursively traverse a dictionary and convert any Path() objects to strings for writing to YAML.
Parameters
config : dict Dictionary to be converted.
Returns
dict The same dictionary with any Path() objects converted to string.
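The recursive traversal can be sketched as follows. The name `path_to_str_sketch` and the configuration keys used to exercise it are hypothetical; unlike the description above, this sketch builds a new dictionary instead of modifying the input, and it does not descend into lists.

```python
from pathlib import Path


def path_to_str_sketch(config: dict) -> dict:
    """Hypothetical sketch: return a copy of ``config`` with Path values as strings."""
    converted = {}
    for key, value in config.items():
        if isinstance(value, dict):
            # Recurse into nested dictionaries.
            converted[key] = path_to_str_sketch(value)
        elif isinstance(value, Path):
            converted[key] = str(value)
        else:
            converted[key] = value
    return converted
```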
Source code in topostats/io.py
read_64d(open_file)
Read a 64-bit double from an open binary file.
Parameters
open_file : io.TextIOWrapper An open file object.
Returns
float Python float type cast from the double.
Source code in topostats/io.py
read_char(open_file)
Read a character from an open binary file.
Parameters
open_file : io.TextIOWrapper An open file object.
Returns
str A string type cast from the decoded character.
Source code in topostats/io.py
read_gwy_component_dtype(open_file)
Read the data type of a .gwy file component.
Possible data types are as follows:
- 'b': boolean
- 'c': character
- 'i': 32-bit integer
- 'q': 64-bit integer
- 'd': double
- 's': string
- 'o': .gwy format object
Capitalised versions of some of these data types represent arrays of values of that data type. Arrays are stored as an unsigned 32 bit integer, describing the size of the array, followed by the unseparated array values:
- 'C': array of characters
- 'I': array of 32-bit integers
- 'Q': array of 64-bit integers
- 'D': array of doubles
- 'S': array of strings
- 'O': array of objects.
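The array layout described above (an unsigned 32-bit length followed by the unseparated values) can be illustrated for a 'D' array of doubles. The helper name `read_double_array_sketch` is hypothetical, and little-endian byte order is assumed for both the length and the values.

```python
import io
import struct


def read_double_array_sketch(open_file) -> list:
    """Hypothetical sketch: read a length-prefixed array of 64-bit doubles."""
    # The array size is stored first as an unsigned 32-bit integer.
    (count,) = struct.unpack("<I", open_file.read(4))
    # The values follow immediately, with no separators, 8 bytes per double.
    return list(struct.unpack(f"<{count}d", open_file.read(8 * count)))


# Build an in-memory example: three doubles preceded by their count.
payload = struct.pack("<I", 3) + struct.pack("<3d", 1.0, 2.5, -4.0)
values = read_double_array_sketch(io.BytesIO(payload))
```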
Parameters
open_file : io.TextIOWrapper An open file object.
Returns
str Python string (one character long) of the data type of the component's value.
Source code in topostats/io.py
read_null_terminated_string(open_file, encoding='utf-8')
Read from the current position in an open binary file until the next null byte.
Parameters
open_file : io.TextIOWrapper An open file object. encoding : str Encoding to use when decoding the bytes.
Returns
str String of the decoded bytes before the next null byte.
Examples
>>> with open("test.txt", "rb") as f:
...     print(read_null_terminated_string(f, encoding="utf-8"))
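Reading up to a null byte can be sketched with an in-memory stream, which avoids needing a file on disk. The name `read_null_terminated_string_sketch` is hypothetical, and stopping quietly at end of file is an assumption of this sketch rather than documented behaviour.

```python
import io


def read_null_terminated_string_sketch(open_file, encoding: str = "utf-8") -> str:
    """Hypothetical sketch: read bytes until the next null byte and decode them."""
    collected = bytearray()
    while True:
        byte = open_file.read(1)
        # Stop at the null terminator, or at end of file (assumption).
        if byte in (b"", b"\x00"):
            break
        collected += byte
    return collected.decode(encoding)


stream = io.BytesIO(b"height\x00rest")
name = read_null_terminated_string_sketch(stream)
remainder = stream.read()
```

The terminator is consumed, so the stream is left positioned at the first byte after it.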
Source code in topostats/io.py
read_u32i(open_file)
Read an unsigned 32 bit integer from an open binary file (in little-endian form).
Parameters
open_file : io.TextIOWrapper An open file object.
Returns
int Python integer type cast from the unsigned 32 bit integer.
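Fixed-width binary reads such as this one and read_64d can be sketched with the standard `struct` module. The function names are hypothetical; little-endian order is stated above for the 32-bit integer and is only assumed here for the double.

```python
import io
import struct


def read_u32i_sketch(open_file) -> int:
    """Hypothetical sketch: read a little-endian unsigned 32-bit integer."""
    return int(struct.unpack("<I", open_file.read(4))[0])


def read_64d_sketch(open_file) -> float:
    """Hypothetical sketch: read a 64-bit double (little-endian assumed)."""
    return float(struct.unpack("<d", open_file.read(8))[0])


# An in-memory stream holding one integer followed by one double.
stream = io.BytesIO(struct.pack("<I", 512) + struct.pack("<d", 0.25))
size = read_u32i_sketch(stream)
scaling = read_64d_sketch(stream)
```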
Source code in topostats/io.py
read_yaml(filename)
Read a YAML file.
Parameters
filename : Union[str, Path] YAML file to read.
Returns
dict Dictionary of the file.
Source code in topostats/io.py
save_array(array, outpath, filename, array_type)
Save a Numpy array to disk.
Parameters
array : npt.NDArray Numpy array to be saved. outpath : Path Location array should be saved. filename : str Filename of the current image from which the array is derived. array_type : str Short string describing the array type e.g. z_threshold. Ideally should not have periods or spaces in (use underscores '_' instead).
Source code in topostats/io.py
save_folder_grainstats(output_dir, base_dir, all_stats_df, stats_filename)
Save a data frame of grain and tracing statistics at the folder level.
Parameters
output_dir : Union[str, Path] Path of the output directory head. base_dir : Union[str, Path] Path of the base directory where files were found. all_stats_df : pd.DataFrame The dataframe containing all sample statistics run. stats_filename : str The name of the type of statistics dataframe to be saved.
Returns
None This only saves the dataframes and does not retain them.
Source code in topostats/io.py
save_pkl(outfile, to_pkl)
Pickle objects for working with later.
Parameters
outfile : Path Path and filename to save pickle to. to_pkl : dict Object to be pickled.
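A pickle round trip of the kind save_pkl and load_pkl provide can be sketched with the standard library. The function names below are hypothetical stand-ins, not the library's code. As with any pickle, only files from a trusted source should be loaded.

```python
import pickle
from pathlib import Path


def save_pkl_sketch(outfile: Path, to_pkl: dict) -> None:
    """Hypothetical sketch: write an object to ``outfile`` as a pickle."""
    with Path(outfile).open("wb") as handle:
        pickle.dump(to_pkl, handle)


def load_pkl_sketch(infile: Path):
    """Hypothetical sketch: read a pickled object back from ``infile``."""
    with Path(infile).open("rb") as handle:
        return pickle.load(handle)
```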
Source code in topostats/io.py
save_topostats_file(output_dir, filename, topostats_object)
Save a topostats dictionary object to a .topostats (hdf5 format) file.
Parameters
output_dir : Path Directory to save the .topostats file in. filename : str File name of the .topostats file. topostats_object : dict Dictionary of the topostats data to save. Must include a flattened image and pixel to nanometre scaling factor. May also include grain masks.
Source code in topostats/io.py
write_config_with_comments(args=None)
Write a sample configuration with in-line comments.
This function is not designed to be used interactively, but it can be: call it without any arguments and it will write a configuration to './config.yaml'.
Parameters
args : Namespace A Namespace object parsed from argparse with values for 'filename'.
Source code in topostats/io.py
write_yaml(config, output_dir, config_file='config.yaml', header_message=None)
Write a configuration (stored as a dictionary) to a YAML file.
Parameters
config : dict Configuration dictionary. output_dir : Union[str, Path] Path to save the dictionary to as a YAML file (it will be called 'config.yaml' by default). config_file : str Filename to write to. header_message : str String to write to the header message of the YAML file.
Source code in topostats/io.py