lipd package¶
Module contents¶
-
lipd.addEnsemble(D, dsn, ensemble)¶ Create ensemble entry and then add it to the specified LiPD dataset.
Parameters: - D (dict) – LiPD data
- dsn (str) – Dataset name
- ensemble (list) – Nested numpy array of ensemble column data.
Return dict D: LiPD data
-
lipd.collapseTs(ts=None)¶ Collapse a time series back into LiPD record form.
Example:
    1. D = lipd.readLipd()
    2. ts = lipd.extractTs(D)
    3. New_D = lipd.collapseTs(ts)
Parameters: ts (list) – Time series
Return dict: Metadata
-
lipd.doi()¶ Update publication information using data DOIs. Updates LiPD files on disk, not in memory.
Example:
    1: lipd.readLipd()
    2: lipd.doi()
Return none:
-
lipd.ensToDf(ensemble)¶ Create an ensemble data frame from some given nested numpy arrays
Parameters: ensemble (list) – Ensemble data Return obj df: Pandas dataframe
-
lipd.excel()¶ Convert Excel files to LiPD files. LiPD data is returned directly from this function.
Example:
    1: lipd.readExcel()
    2: D = lipd.excel()
Return dict _d: Metadata
-
lipd.extractTs(d, chron=False)¶ Create a time series using LiPD data (uses paleoData by default)
Example: paleoData
    1. D = lipd.readLipd()
    2. ts = lipd.extractTs(D)
Example: chronData
    1. D = lipd.readLipd()
    2. ts = lipd.extractTs(D, chron=True)
Parameters: - d (dict) – Metadata
- chron (bool) – Create a chronData time series
Return list l: Time series
-
lipd.filterTs(ts, expression)¶ Create a new time series that only contains entries that match the given expression.
Example:
    D = lipd.readLipd()
    ts = lipd.extractTs(D)
    new_ts = lipd.filterTs(ts, "archiveType == marine sediment")
    new_ts = lipd.filterTs(ts, "paleoData_variableName == sst")
Parameters: - expression (str) – Expression
- ts (list) – Time series
Return list new_ts: Filtered time series that matches the expression
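The expression strings in the filterTs example follow a "key operator value" pattern. The sketch below shows one way such an expression could be parsed and applied to a list of time series entries. It is not the lipd implementation: parse_expression, filter_entries, and the operator table are hypothetical names, and the entries are assumed to be plain dictionaries.

```python
import operator
import re

# Hypothetical operator table; these names are illustrative, not lipd internals.
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
}

# key, then operator, then everything else as the value (may contain spaces).
_EXPR = re.compile(r"^\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*(.+?)\s*$")


def parse_expression(expression):
    """Split 'key op value' into its three parts."""
    match = _EXPR.match(expression)
    if match is None:
        raise ValueError("Unrecognized expression: %r" % expression)
    return match.groups()


def _coerce(text):
    """Use a float when the text is numeric, otherwise keep the string."""
    try:
        return float(text)
    except ValueError:
        return text


def filter_entries(ts, expression):
    """Return the entries of ts whose key satisfies the expression."""
    key, op, raw = parse_expression(expression)
    target = _coerce(raw)
    compare = _OPS[op]
    kept = []
    for entry in ts:
        if key not in entry:
            continue
        try:
            if compare(entry[key], target):
                kept.append(entry)
        except TypeError:
            # Mixed types (e.g. str vs float) cannot be ordered; skip them.
            continue
    return kept
```

Returning the positions of the kept entries instead of the entries themselves would give the index-based behaviour described for queryTs.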
-
lipd.getCsv(L=None)¶ Get CSV from LiPD metadata
Example:
    c = lipd.getCsv(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters: L (dict) – One LiPD record
Return dict d: CSV data
-
lipd.getLipdNames(D=None)¶ Get a list of all LiPD names in the library
Example:
    names = lipd.getLipdNames(D)
Return list f_list: File list
-
lipd.getMetadata(L)¶ Get the metadata from one LiPD record in memory
Example:
    m = lipd.getMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters: L (dict) – One LiPD record
Return dict d: LiPD record (metadata only)
-
lipd.noaa(d=None)¶ Convert between NOAA and LiPD files
Example: LiPD to NOAA converter
    1: D = lipd.readLipd()
    2: lipd.noaa(D)
Example: NOAA to LiPD converter
    1: lipd.readNoaa()
    2: lipd.noaa()
Return none:
-
lipd.queryTs(ts, expression)¶ Find the indices of the time series entries that match the given expression.
Example:
    D = lipd.readLipd()
    ts = lipd.extractTs(D)
    matches = lipd.queryTs(ts, "archiveType == marine sediment")
    matches = lipd.queryTs(ts, "geo_meanElev <= 2000")
Parameters: - expression (str) – Expression
- ts (list) – Time series
Return list _idx: Indices of entries that match the criteria
-
lipd.readAll(usr_path='')¶ Read all approved file types at once. Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional) Return str cwd: Current working directory
-
lipd.readExcel(usr_path='')¶ Read Excel file(s). Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional) Return str cwd: Current working directory
-
lipd.readLipd(usr_path='')¶ Read LiPD file(s). Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional) Return dict _d: Metadata
-
lipd.readNoaa(usr_path='')¶ Read NOAA file(s). Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional) Return str cwd: Current working directory
-
lipd.run()¶ Initialize and start objects. This is called automatically when importing the package.
Return none:
-
lipd.showDfs(d)¶ Display the available data frame names in a given data frame collection
Parameters: d (dict) – Dataframe collection Return none:
-
lipd.showLipds(D=None)¶ Display the dataset names in the given LiPD data
Example:
    lipd.showLipds(D)
Parameters: D (dict) – LiPD data
Return none:
-
lipd.showMetadata(dat)¶ Pretty-print the metadata of the specified LiPD record
Example:
    lipd.showMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters: dat (dict) – Metadata
Return none:
-
lipd.tsToDf(tso)¶ Create Pandas DataFrames from one time series entry. Usage: first call extractTs to get a time series, then pick one entry from it and pass that entry to this function.
Parameters: tso (dict) – Time series entry Return dict dfs: Pandas dataframes
-
lipd.viewTs(ts)¶ View the contents of one time series entry in a nicely formatted way
Example:
    1. D = lipd.readLipd()
    2. ts = lipd.extractTs(D)
    3. lipd.viewTs(ts[0])
Parameters: ts (dict) – One time series entry
Return none:
-
lipd.writeLipd(dat, usr_path='', filename='')¶ Write LiPD data to file(s)
Parameters: - dat (dict) – Metadata
- usr_path (str) – Destination (optional)
- filename (str) – LiPD filename, for writing one specific file (optional)
Return none:
Submodules¶
alternates¶
List of alternate and synonym keys
bag¶
-
lipd.bag.create_bag(dir_bag)¶ Create a Bag out of given files. :param str dir_bag: Directory that contains csv, jsonld, and changelog files. :return obj: Bag
-
lipd.bag.finish_bag(dir_bag)¶ Closing steps for creating a bag :param obj dir_bag: :return None:
-
lipd.bag.open_bag(dir_bag)¶ Open Bag at the given path :param str dir_bag: Path to Bag :return obj: Bag
-
lipd.bag.resolved_flag(bag)¶ Check DOI flag in bag.info to see if doi_resolver has been previously run :param obj bag: Bag :return bool: Flag
-
lipd.bag.validate_md5(bag)¶ Check if Bag is valid :param obj bag: Bag :return None:
blanks¶
List of empty and ignored keys
csvs¶
-
lipd.csvs.get_csv_from_metadata(name, metadata)¶ Two goals. Get all csv from metadata, and return new metadata with generated filenames to match files. :param str name: LiPD dataset name :param dict metadata: Metadata :return dict: Csv Data
-
lipd.csvs.merge_csv_metadata(d)¶ Using the given metadata dictionary, retrieve CSV data from CSV files, and insert the CSV values into their respective metadata columns. Checks for both paleoData and chronData tables. :param dict d: Metadata :return dict: Modified metadata dictionary
-
lipd.csvs.read_csv_from_file(filename)¶ Opens the target CSV file and creates a dictionary with one list for each CSV column. :param str filename: :return list of lists: column values
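As an illustration of the "one list for each CSV column" layout, here is a minimal sketch built on the standard csv module. The function name read_columns is hypothetical, the integer column keys are an assumption borrowed from the write_csv_to_file description, and values are left as strings rather than cast to numbers.

```python
import csv


def read_columns(filename):
    """Read a CSV file into {column_index: [cell, cell, ...]}."""
    columns = {}
    with open(filename, newline="") as handle:
        for row in csv.reader(handle):
            for idx, cell in enumerate(row):
                # Create the column list on first sight, then append to it.
                columns.setdefault(idx, []).append(cell)
    return columns
```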
-
lipd.csvs.write_csv_to_file(d)¶ Writes columns of data to a target CSV file. :param dict d: A dictionary containing one list for every data column. Keys: int, Values: list :return None:
dataframes¶
-
lipd.dataframes.create_dataframe(ensemble)¶ Create a data frame from given nested lists of ensemble data :param list ensemble: Ensemble data :return obj: Dataframe
-
lipd.dataframes.get_filtered_dfs(lib, expr)¶ Main: Get all data frames that match the given expression :return dict: Filenames and data frames (filtered)
-
lipd.dataframes.lipd_to_df(metadata, csvs)¶ Create an organized collection of data frames from LiPD data :param dict metadata: LiPD data :param dict csvs: Csv data :return dict: One data frame per table, organized in a dictionary by name
-
lipd.dataframes.ts_to_df(metadata)¶ Create a data frame from one TimeSeries object :param dict metadata: Time Series dictionary :return dict: One data frame per table, organized in a dictionary by name
directory¶
-
lipd.directory.browse_dialog_dir()¶ Open a GUI browse dialog window and let the user pick a target directory. :return str: Target directory path
-
lipd.directory.browse_dialog_file()¶ Open a GUI browse dialog window and let the user select one or more files. :return str _path: Target directory path :return list _files: List of selected files
-
lipd.directory.check_file_age(filename, days)¶ Check whether the target file was created more than the given number of days ago. One day is 60*60*24 seconds. :param str filename: Target filename :param int days: Limit in number of days :return bool: True - older than the limit, False - not older than the limit
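The age check described above amounts to comparing a file timestamp against a limit expressed in seconds. The sketch below is an assumption-laden illustration, not the lipd code: is_older_than is a hypothetical name, and it uses the modification time as a portable stand-in because a true creation time is not available on every platform.

```python
import os
import time

SECONDS_PER_DAY = 60 * 60 * 24


def is_older_than(filename, days, now=None):
    """Return True if the file's timestamp is more than `days` days before `now`."""
    if now is None:
        now = time.time()
    # Assumption: modification time stands in for the creation date.
    age_seconds = now - os.path.getmtime(filename)
    return age_seconds > days * SECONDS_PER_DAY
```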
-
lipd.directory.collect_metadata_file(full_path)¶ Create the file metadata and add it to the appropriate section by file-type :param str full_path: :param dict existing_files: :return dict existing files:
-
lipd.directory.collect_metadata_files(cwd, new_files, existing_files)¶ Collect all files from a given path. Separate by file type, and return one list for each type If ‘files’ contains specific :param str cwd: Directory w/ target files :param list new_files: Specific new files to load :param dict existing_files: Files currently loaded, separated by type :return list: All files separated by type
-
lipd.directory.create_tmp_dir()¶ Creates tmp directory in OS temp space. :return str: Path to tmp directory
-
lipd.directory.dir_cleanup(dir_bag, dir_data)¶ Moves JSON and csv files to bag root, then deletes all the metadata bag files. We’ll be creating a new bag with the data files, so we don’t need the other text files and such. :param str dir_bag: Path to root of Bag :param str dir_data: Path to Bag /data subdirectory :return None:
-
lipd.directory.filename_from_path(path)¶ Extract the file name from a given file path. :param str path: File path :return str: File name with extension
-
lipd.directory.find_files()¶ Search for the directory containing jsonld and csv files. chdir and then quit. :return none:
-
lipd.directory.get_filenames_generated(d, name='', csvs='')¶ Get the filenames that the LiPD utilities has generated (per naming standard), as opposed to the filenames that originated in the LiPD file (that possibly don’t follow the naming standard) :param dict d: Data :param str name: LiPD dataset name to prefix :param list csvs: Filenames list to merge with :return list: Filenames
-
lipd.directory.get_filenames_in_lipd(path, name='')¶ List all the files contained in the LiPD archive. Bagit, JSON, and CSV :param str path: Directory to be listed :param str name: LiPD dataset name, if you want to prefix it to show file hierarchy :return list: Filenames found
-
lipd.directory.get_src_or_dst(mode, path_type)¶ User sets the path to a LiPD source location :param str mode: “read” or “write” mode :param str path_type: “directory” or “file” :return str path: dir path to files :return list files: files chosen
-
lipd.directory.get_src_or_dst_path(prompt, count)¶ Let the user choose a path, and store the value. :return str _path: Target directory :return str count: Counter for attempted prompts
-
lipd.directory.get_src_or_dst_prompt(mode)¶ String together the proper prompt based on the mode :param str mode: “read” or “write” :return str prompt: The prompt needed
-
lipd.directory.list_files(x, path='')¶ Lists file(s) in given path of the X type. :param str x: File extension that we are interested in. :param str path: Path, if user would like to check a specific directory outside of the CWD :return list of str: File name(s) to be worked on
-
lipd.directory.rm_file_if_exists(path, filename)¶ Remove a file if it exists. Useful for when we want to write a file, but it already exists in that location. :param str filename: Filename :param str path: Directory :return none:
-
lipd.directory.rm_files_in_dir(path)¶ Removes all files within a directory, but does not delete the directory :param str path: Target directory :return none:
doi_main¶
-
lipd.doi_main.doi_main(files)¶ Main function that controls the script. Take in directory containing the .lpd file(s). Loop for each file. :return None:
-
lipd.doi_main.process_lpd(name, dir_tmp)¶ Opens up json file, invokes doi_resolver, closes file, updates changelog, cleans directory, and makes new bag. :param str name: Name of current .lpd file :param str dir_tmp: Path to tmp directory :return none:
-
lipd.doi_main.prompt_force()¶ Ask the user if they want to force update files that were previously resolved :return bool: response
doi_resolver¶
-
class lipd.doi_resolver.DOIResolver(dir_root, name, root_dict)¶ Bases: object
Use DOI id(s) to pull updated publication info from doi.org and overwrite file data.
Input: Original publication dictionary
Output: Updated publication dictionary (success), original publication dictionary (fail)
-
static compare_replace(pub_dict, fetch_dict)¶ Take in our Original Pub, and Fetched Pub. For each Fetched entry that has data, overwrite the Original entry :param pub_dict: (dict) Original pub dictionary :param fetch_dict: (dict) Fetched pub dictionary from doi.org :return: (dict) Updated pub dictionary, with fetched data taking precedence
Compiles authors “Last, First” into a single list :param list authors: Raw author data retrieved from doi.org :return list: Author objects
-
static compile_date(date_parts)¶ Compiles date only using the year :param list date_parts: List of date parts retrieved from doi.org :return str: Date string or NaN
-
compile_fetch(raw, doi_id)¶ Loop over Raw and add selected items to Fetch with proper formatting :param dict raw: JSON data from doi.org :param str doi_id: :return dict:
-
find_doi(curr_dict)¶ Recursively search the file for the DOI id. More taxing, but more flexible when dictionary structuring isn’t absolute :param dict curr_dict: Current dictionary being searched :return dict bool: Recursive - Current dictionary, False flag that DOI was not found :return str bool: Final - DOI id, True flag that DOI was found
-
get_data(doi_id, idx)¶ Resolve DOI and compile all attributes into one dictionary :param str doi_id: :param int idx: Publication index :return dict: Updated publication dictionary
-
illegal_doi(doi_string)¶ DOI string did not match the regex. Determine what the data is. :param doi_string: (str) Malformed DOI string :return: None
-
main()¶ Main function that gets file(s), creates outputs, and runs all operations. :return dict: Updated or original data for jsonld file
-
noaa_citation(doi_string)¶ Special instructions for moving noaa data to the correct fields :param doi_string: (str) NOAA url :return: None
-
remove_empties(pub)¶
-
static
ensembles¶
-
lipd.ensembles.create_ensemble(ensemble)¶ Add ensemble data to a LiPD object :param list ensemble: Ensemble data nested lists :return dict: Structured Ensemble data
-
lipd.ensembles.insert_ensemble(d, ens)¶ Insert the ensemble table dictionary into the LiPD metadata :param dict d: LiPD metadata :param dict ens: Ensemble data to insert :return dict:
excel¶
-
lipd.excel.cells_dn_meta(workbook, sheet, row, col, final_dict)¶ Traverse all cells in a column moving downward. Primarily created for the metadata sheet, but may use elsewhere. Check the cell title, and switch it to. :param obj workbook: :param str sheet: :param int row: :param int col: :param dict final_dict: :return: none
-
lipd.excel.cells_rt_meta(workbook, sheet, row, col)¶ Traverse all cells in a row. If you find new data in a cell, add it to the list. :param obj workbook: :param str sheet: :param int row: :param int col: :return list: Cell data for a specific row
-
lipd.excel.cells_rt_meta_pub(workbook, sheet, row, col, pub_qty)¶ Publication section is special. It’s possible there’s more than one publication. :param obj workbook: :param str sheet: :param int row: :param int col: :param int pub_qty: Number of distinct publication sections in this file :return list: Cell data for a specific row
Split the string of author names into the BibJSON format. :param str cell: Data from author cell :return: (list of dicts) Author names
-
lipd.excel.compile_fund(workbook, sheet, row, col)¶ Compile funding entries. Iter both rows at the same time. Keep adding entries until both cells are empty. :param obj workbook: :param str sheet: :param int row: :param int col: :return list of dict: l
-
lipd.excel.compile_geo(d)¶ Compile top-level Geography dictionary. :param d: :return:
-
lipd.excel.compile_geometry(lat, lon, elev)¶ Take in lists of lat and lon coordinates, and determine what geometry to create :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:
-
lipd.excel.compile_temp(d, key, value)¶ Compiles temporary dictionaries for metadata. Adds a new entry to an existing dictionary. :param dict d: :param str key: :param any value: :return dict:
-
lipd.excel.count_chron_variables(temp_sheet)¶ Count the number of chron variables :param obj temp_sheet: :return int: variable count
-
lipd.excel.excel_main(file)¶ Parse data from Excel spreadsheets into LiPD files. :return list: Filenames of LiPD files created
-
lipd.excel.extract_short(string_in)¶ Extract the short name from a string that also has units. :param str string_in: :return str:
-
lipd.excel.extract_units(string_in)¶ Extract units from parentheses in a string, e.g. “elevation (meters)”. :param str string_in: :return str:
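For a header such as "elevation (meters)", extract_short and extract_units describe the two halves of the same split. A minimal sketch of that split with a regular expression follows; split_name_and_units is a hypothetical helper, and returning an empty string when no parentheses are present is an assumption.

```python
import re

# Text inside the first pair of parentheses.
_UNITS = re.compile(r"\(([^()]*)\)")


def split_name_and_units(text):
    """Return (short_name, units) for a header such as 'elevation (meters)'."""
    match = _UNITS.search(text)
    if match is None:
        # No parentheses: the whole string is the name, and units are unknown.
        return text.strip(), ""
    short = text[:match.start()].strip()
    return short, match.group(1).strip()
```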
-
lipd.excel.geometry_linestring(lat, lon, elev)¶ GeoJSON Linestring. Latitude and Longitude have 2 values each. :param list lat: Latitude values :param list lon: Longitude values :return dict:
-
lipd.excel.geometry_point(lat, lon, elev)¶ GeoJSON point. Latitude and Longitude only have one value each :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:
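To make the point case concrete, the sketch below builds a GeoJSON Point geometry from single-value latitude and longitude lists. It follows the GeoJSON specification (RFC 7946), which orders a position as longitude, latitude, then optional elevation. Whether lipd emits exactly this layout is not stated here, so treat geojson_point and its output as an illustration only.

```python
def geojson_point(lat, lon, elev=None):
    """Build a GeoJSON Point from one latitude and one longitude value."""
    # RFC 7946 position order: longitude first, then latitude.
    coordinates = [lon[0], lat[0]]
    if elev is not None:
        coordinates.append(elev)
    return {"type": "Point", "coordinates": coordinates}
```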
-
lipd.excel.geometry_range(crd_range, elev, crd_type)¶ Range of coordinates. (e.g. 2 latitude coordinates, and 0 longitude coordinates) :param crd_range: Latitude or Longitude values :param elev: Elevation value :param crd_type: Coordinate type, lat or lon :return dict:
-
lipd.excel.get_chron_data(temp_sheet, row, total_vars)¶ Capture all data for a specific chron data row (for csv output) :param obj temp_sheet: :param int row: :param int total_vars: :return list: data_row
-
lipd.excel.get_chron_var(temp_sheet, start_row)¶ Capture all the vars in the chron sheet (for json-ld output) :param obj temp_sheet: :param int start_row: :return: (list of dict) column data
-
lipd.excel.instance_str(cell)¶ Match data type and return string :param any cell: :return str:
-
lipd.excel.logger_excel= <logging.Logger object>¶ VERSION: LiPD v1.2
-
lipd.excel.name_to_jsonld(title_in)¶ Convert formal titles to camelcase json_ld text that matches our context file. Keep a growing list of all titles that are being used in the json_ld context. :param str title_in: :return str:
-
lipd.excel.traverse_to_chron_data(temp_sheet)¶ Traverse down to the first row that has chron data :param obj temp_sheet: :return int: traverse_row
-
lipd.excel.traverse_to_chron_var(temp_sheet)¶ Traverse down to the row that has the first variable :param obj temp_sheet: :return int:
inferred_data¶
-
lipd.inferred_data.get_inferred_data_table(pc, table)¶ Table level: Dive down, calculate data, then return the new table with the inferred data. :param str pc: Paleo or Chron table type :param dict table: Table data :return dict table: Table with new data
io¶
jsons¶
-
lipd.jsons.get_csv_from_json(d)¶ Get CSV values when mixed into json data. Pull out the CSV data and put it into a dictionary. :param dict d: JSON with CSV values :return dict: CSV values. (i.e. { CSVFilename1: { Column1: [Values], Column2: [Values] }, CSVFilename2: … }
-
lipd.jsons.idx_name_to_num(d)¶ Switch from index-by-name to index-by-number. :param dict d: Metadata :return dict: Modified metadata
-
lipd.jsons.idx_num_to_name(d)¶ Switch from index-by-number to index-by-name. :param dict d: Metadata :return dict: Modified Metadata
-
lipd.jsons.read_json_from_file(filename)¶ Import the JSON data from target file. :param str filename: Target File :return dict: JSON data
-
lipd.jsons.read_jsonld()¶ Find the jsonld file in the cwd (or within two levels below it), and load it. :return dict: Jsonld data
-
lipd.jsons.remove_csv_from_json(d)¶ Remove all CSV data ‘values’ entries from paleoData table in the JSON structure. :param dict d: JSON data - old structure :return dict: Metadata dictionary without CSV values
-
lipd.jsons.write_json_to_file(json_data, filename='metadata')¶ Write all JSON in python dictionary to a new json file. :param dict json_data: JSON data :param str filename: Target filename (defaults to ‘metadata.jsonld’) :return None:
loggers¶
-
lipd.loggers.create_benchmark(name, log_file, level=20)¶ Creates a logger for function benchmark times :param str name: Name of the logger :param str log_file: Filename :return obj: Logger
-
lipd.loggers.create_logger(name)¶ Creates a logger with the below attributes. :param str name: Name of the logger :return obj: Logger
-
lipd.loggers.log_benchmark(fn, start, end)¶ Log a given function and how long the function takes in seconds :param str fn: Function name :param float start: Function start time :param float end: Function end time :return none:
-
lipd.loggers.update_changelog()¶ Create or update the changelog txt file. Prompt for update description. :return None:
lpd_noaa¶
-
class lipd.lpd_noaa.LPD_NOAA(dir_root, name, lipd_dict)¶ Bases: object
Creates a NOAA object that contains all the functions needed to write out a LiPD file as a NOAA text file.
Supports LiPD Version: v1.2
NOAA txt template: v3.0
Return none: Writes NOAA text to file in local storage
-
get_master()¶ Get the master json that has been modified :return dict: self.lipd_data
-
get_wdc_paleo_url()¶ When a NOAA file is created, it creates a URL link to where the dataset will be hosted in NOAA’s archive. Retrieve and add this link to the original LiPD file, so we can trace the dataset to NOAA. :return str:
-
main()¶ Load in the template file, and run through the parser :return none:
-
misc¶
-
lipd.misc.cast_float(x)¶ Attempt to cleanup string or convert to number value. :param any x: :return float:
-
lipd.misc.cast_int(x)¶ Cast unknown type into integer :param any x: :return int:
-
lipd.misc.cast_values_csvs(d, idx, x)¶ Attempt to cast string to float. If error, keep as a string. :param dict d: Data :param int idx: Index number :param str x: Data :return any:
-
lipd.misc.check_dsn(name, _json)¶ Get a dataSetName. If one is not provided, then insert the filename as the dataSetName. :param str name: Filename w/o extension :param dict _json: Metadata :return dict _json: Metadata
-
lipd.misc.clean_doi(doi_string)¶ Use regex to extract all DOI ids from string (i.e. 10.1029/2005pa001215) :param str doi_string: Raw DOI string value from input file. Often not properly formatted. :return list: DOI ids. May contain 0, 1, or multiple ids.
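A regex-based extraction like the one described for clean_doi can be sketched as follows. The pattern here is an assumption chosen for illustration (a "10." prefix, a registrant code, a slash, and a suffix of common DOI characters). It is not the pattern lipd uses, and find_dois is a hypothetical name.

```python
import re

# Assumed pattern: "10." + 4-9 digit registrant code + "/" + suffix characters.
_DOI = re.compile(r"10\.\d{4,9}/[-._()/:A-Za-z0-9]+")


def find_dois(raw):
    """Return every DOI-like id found in a raw string (possibly none)."""
    # A trailing period is usually sentence punctuation, not part of the id.
    return [match.rstrip(".") for match in _DOI.findall(raw)]
```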
-
lipd.misc.fix_coordinate_decimal(d)¶ Coordinate decimal degrees calculated by an excel formula are often too long as a repeating decimal. Round them down to 5 decimals :param dict d: Metadata :return dict d: Metadata
-
lipd.misc.generate_timestamp(fmt=None)¶ Generate a timestamp to mark when this file was last modified. :param str fmt: Special format instructions :return str: YYYY-MM-DD format, or specified format
-
lipd.misc.generate_tsid(size=8)¶ Generate a TSid string. Use the “PYT” prefix for traceability, and 8 trailing generated characters ex: PYT9AG234GS :return:
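A TSid of the form shown above (the "PYT" prefix plus 8 generated characters) could be produced along these lines. The alphabet of uppercase letters and digits is an assumption inferred from the example id, and make_tsid is a hypothetical name rather than the library function.

```python
import random
import string


def make_tsid(size=8, prefix="PYT"):
    """Return the prefix followed by `size` random uppercase letters or digits."""
    alphabet = string.ascii_uppercase + string.digits
    return prefix + "".join(random.choice(alphabet) for _ in range(size))
```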
-
lipd.misc.get_appended_name(name, columns)¶ Append numbers to a name until it no longer conflicts with the other names in a column. Necessary to avoid overwriting columns and losing data. Loop a preset amount of times to avoid an infinite loop. There shouldn’t ever be more than two or three identical variable names in a table. :param str name: Variable name in question :param dict columns: Columns listed by variable name :return str: Appended variable name
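The bounded retry loop described for get_appended_name can be sketched like this. The "-N" suffix format and the limit of 10 attempts are assumptions made for the example, and appended_name is a hypothetical stand-in for the library function.

```python
def appended_name(name, columns, max_tries=10):
    """Return `name`, or `name-N` for the first N that is not already a column."""
    if name not in columns:
        return name
    # Bounded loop: give up instead of spinning forever on pathological input.
    for i in range(1, max_tries + 1):
        candidate = "{}-{}".format(name, i)
        if candidate not in columns:
            return candidate
    raise ValueError("No free name found for %r after %d tries" % (name, max_tries))
```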
Take author or investigator data, and convert it to a concatenated string of names. Author data structure has a few variations, so account for all. :param any x: Author data :return str: Author string
-
lipd.misc.get_dsn(d)¶ Get the dataset name from a record :param dict d: Metadata :return str: Dataset name
-
lipd.misc.get_ensemble_counts(d)¶ Determine if this is a 1 or 2 column ensemble. Then determine how many columns and rows it has. :param d: :return:
-
lipd.misc.get_missing_value_key(d)¶ Get the Missing Value entry from a table of data. If none is found, try the columns. If still none found, prompt user. :param dict d: Table of data :return str: Missing Value
-
lipd.misc.get_table_key(key, d, fallback='')¶ Try to get a table name from a data table :param str key: Key to try first :param dict d: Data table :param str fallback: (optional) If we don’t find a table name, use this as a generic name fallback. :return str: Data table name
-
lipd.misc.get_variable_name_col(d)¶ Get the variable name from a table or column :param dict d: Metadata :return str:
-
lipd.misc.is_ensemble(d)¶ Check if a table of data is an ensemble table. Is the first values index a list? ensemble. Int/float? not ensemble. :param dict d: Table data :return bool: Ensemble or not ensemble
-
lipd.misc.load_fn_matches_ext(file_path, file_type)¶ Check that the file extension matches the target extension given. :param str file_path: Path to be checked :param str file_type: Target extension :return bool:
-
lipd.misc.match_arr_lengths(l)¶ Check that all the array lengths match so that a DataFrame can be created successfully. :param list l: Nested arrays :return bool: Valid or invalid
-
lipd.misc.match_operators(inp, relate, cut)¶ Compare two items. Match a string operator to an operator function :param str inp: Comparison item :param str relate: Comparison operator :param any cut: Comparison item :return bool: Comparison truth
-
lipd.misc.mv_files(src, dst)¶ Move all files from one directory to another :param str src: Source directory :param str dst: Destination directory :return none:
-
lipd.misc.normalize_name(s)¶ Remove foreign accents and characters to normalize the string. Prevents encoding errors. :param str s: :return str:
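One common way to strip accents, shown here as a sketch rather than as the lipd implementation, is to decompose the string with Unicode NFKD normalization and then drop everything outside ASCII. The name strip_accents is hypothetical.

```python
import unicodedata


def strip_accents(s):
    """Return an ASCII-only version of s with accents removed."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", s)
    # Encoding with errors="ignore" discards the marks and other non-ASCII.
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

Characters that have no ASCII base letter are dropped entirely under this approach, which may or may not be acceptable for a given dataset name.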
-
lipd.misc.path_type(path, target)¶ Determine if given path is file, directory, or other. Compare with target to see if it’s the type we wanted. :param str path: Path :param str target: Target type wanted :return bool:
-
lipd.misc.prompt_protocol()¶ Prompt user if they would like to save pickle file as a dictionary or an object. :return str: Answer
-
lipd.misc.put_tsids(x)¶ Recursively add in TSids into any columns that do not have them. Look for “columns” keys, and then start looping and adding generated TSids to each column :param any x: Recursive, so could be any data type. :return any x: Recursive, so could be any data type.
-
lipd.misc.rm_empty_doi(d)¶ If an “identifier” dictionary has no doi ID, then it has no use. Delete it. :param dict d: JSON Metadata :return dict: JSON Metadata
-
lipd.misc.rm_empty_fields(x)¶ Recursively go through any number of nested data types and remove all empty entries. :param any x: Dictionary, List, or String of data :return any: Returns the same data type as the original, but without empties.
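A recursive cleanup of this kind can be sketched as below. What counts as "empty" is an assumption here (None, empty string, empty list, empty dict), so numeric zero and False are kept. The helper names are hypothetical.

```python
def _is_empty(value):
    """Assumed definition of empty: None, '', [] or {}."""
    return value is None or value == "" or value == [] or value == {}


def remove_empties(x):
    """Return a copy of x with empty entries removed at every nesting level."""
    if isinstance(x, dict):
        cleaned = {}
        for key, value in x.items():
            value = remove_empties(value)
            if not _is_empty(value):
                cleaned[key] = value
        return cleaned
    if isinstance(x, list):
        cleaned = [remove_empties(v) for v in x]
        return [v for v in cleaned if not _is_empty(v)]
    if isinstance(x, str):
        return x.strip()
    return x
```

Children are cleaned before the emptiness test, so a container that only held empty values is itself removed.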
-
lipd.misc.rm_files(path, extension)¶ Remove all files in the given directory with the given extension :param str path: Directory :param str extension: File type to remove :return none:
-
lipd.misc.rm_keys_from_dict(d, keys)¶ Given a dictionary and a key list, remove any data in the dictionary with the given keys. :param dict d: Data :param list keys: List of key data to remove :return dict d: Data (with keys + data removed)
-
lipd.misc.rm_missing_values_table(d)¶ Loop for each table column and remove the missingValue key & data :param dict d: Table data :return dict d: Table data
-
lipd.misc.rm_values_fields(x)¶ (recursive) Remove all “values” fields from the metadata :param any x: Any data type :return dict: metadata without “values”
-
lipd.misc.split_path_and_file(s)¶ Given a full path to a file, split and return a path and filename :param str s: Full path :return str str: Path, filename
-
lipd.misc.unwrap_arrays(l)¶ Unwrap nested lists to be one “flat” list of lists. Mainly for prepping ensemble data for DataFrame() creation :param list l: Nested lists :return list: Flattened lists
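The phrase "one flat list of lists" can be read as: however deeply the columns are nested, the result is a single list whose items are the innermost lists. The sketch below implements that reading; it is an interpretation, and flatten_to_columns is a hypothetical name.

```python
def flatten_to_columns(nested):
    """Collect the innermost lists of `nested` into one flat list of lists."""
    columns = []

    def walk(item):
        # A list that still contains lists is a wrapper: descend into it.
        if isinstance(item, list) and any(isinstance(v, list) for v in item):
            for v in item:
                walk(v)
        else:
            # A list of scalars is one column.
            columns.append(item)

    walk(nested)
    return columns
```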
noaa¶
-
lipd.noaa.lpd_to_noaa(obj)¶ Convert a LiPD format to NOAA format :param obj obj: LiPD object :return obj: LiPD object (modified)
-
lipd.noaa.noaa_prompt()¶ Convert between NOAA and LiPD file formats. :return:
-
lipd.noaa.noaa_to_lpd(files)¶ Convert NOAA format to LiPD format :param dict files: Files metadata :return None:
noaa_lpd¶
regexes¶
timeseries¶
-
lipd.timeseries.collapse(l)¶ LiPD Version 1.3 Main function to initiate time series to LiPD conversion :param list l: Time series :return dict _master: LiPD data, sorted by dataset name
-
lipd.timeseries.extract(d, chron)¶ LiPD Version 1.3 Main function to initiate LiPD to TSOs conversion. :param dict d: Metadata for one LiPD file :param bool chron: Paleo mode (default) or Chron mode :return list _ts: Time series
-
lipd.timeseries.get_matches(expr_lst, ts)¶ Get a list of TimeSeries objects that match the given expression. :param list expr_lst: Expression :param list ts: TimeSeries :return list new_ts: Matched time series objects :return list idxs: Indices of matched objects
-
lipd.timeseries.mode_ts(ec, ts=None, b=None)¶ Get string for the mode :param bool b: Chron boolean (for extract) :param str ec: extract or collapse :param list ts: Time series (for collapse) :return str phrase: Phrase
-
lipd.timeseries.translate_expression(expression)¶ Check if the expression is valid, then turn it into an expression that can be used for filtering. :return list of lists: One or more matches. Each list has 3 strings.
validator_api¶
-
lipd.validator_api.create_detailed_results(data)¶
-
lipd.validator_api.display_results(data, detailed=False)¶ Display the results from the validator in a brief or detailed way. :param dict data: Results, sorted by dataset name :param bool detailed: Detailed results on or off :return none:
-
lipd.validator_api.get_validator_format(data_json, data_csv, filenames)¶ Format the LiPD data in the layout that the Lipd.net validator accepts. Example of one _file metadata entry; _file_list will contain one or more _file entries:
_file = {"type": "bagit/json/csv", "filenameFull": "/path/to/filename.txt", "filenameShort": "filename.txt", "data": "", "pretty": ""}
Parameters: - data_json (dict) – Metadata
- data_csv (dict) – CSV data
- filenames (list) – All files found in LiPD archive
Return list: Validator-formatted data
-
lipd.validator_api.get_validator_results(data)¶ Send LiPD data to the Lipd.net validator and get the results back. :param data: :return:
versions¶
-
lipd.versions.get_lipd_version(d)¶ Check what version of LiPD this file is using. If none is found, assume it’s using version 1.0 :param dict d: Metadata :return float:
-
lipd.versions.update_lipd_v1_1(d)¶ Update LiPD v1.0 to v1.1
- chronData entry is a list that allows multiple tables
- paleoData entry is a list that allows multiple tables
- chronData now allows measurement, model, summary, modelTable, ensemble, calibratedAges tables
- Added ‘lipdVersion’ key
Parameters: d (dict) – Metadata v1.0
Return dict d: Metadata v1.1
-
lipd.versions.update_lipd_v1_2(d)¶ Update LiPD v1.1 to v1.2
- Added NOAA compatible keys: maxYear, minYear, originalDataURL, WDCPaleoURL, etc.
- ‘calibratedAges’ key is now ‘distribution’
- paleoData structure mirrors chronData. Allows measurement, model, summary, modelTable, ensemble, distribution tables
Parameters: d (dict) – Metadata v1.1
Return dict d: Metadata v1.2
-
lipd.versions.update_lipd_v1_3(d)¶ Update LiPD v1.2 to v1.3
- Added ‘createdBy’ key
- Top-level folder inside LiPD archives is named “bag” (no longer <datasetname>)
- .jsonld file is now generically named ‘metadata.jsonld’ (no longer <datasetname>.lpd)
- All “paleo” and “chron” prefixes are removed from “paleoMeasurementTable”, “paleoModel”, etc.
- Merge isotopeInterpretation and climateInterpretation into “interpretation” block
- ensemble table entry is a list that allows multiple tables
- summary table entry is a list that allows multiple tables
:param dict d: Metadata v1.2 :return dict d: Metadata v1.3
-
lipd.versions.update_lipd_v1_3_names(d)¶ Update the key names and merge interpretation data :param dict d: Metadata :return dict d: Metadata
-
lipd.versions.update_lipd_v1_3_structure(d)¶ Update the structure for summary and ensemble tables :param dict d: Metadata :return dict d: Metadata
-
lipd.versions.update_lipd_version(d)¶ Metadata is indexed by number at this step.
Use the current version number to determine where to start updating from. Use “chain versioning” to make it modular. If a file is a few versions behind, convert to EACH version until reaching current. If a file is one version behind, it will only convert once to the newest. :param dict d: Metadata :return dict d: Metadata
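The "chain versioning" idea can be sketched as an ordered list of upgrade steps, each applied only when the record is older than the step's target version. Everything below is illustrative: the step functions are placeholders that only bump the lipdVersion key, whereas the real update_lipd_v1_1, update_lipd_v1_2 and update_lipd_v1_3 functions also restructure the metadata.

```python
def _make_step(target):
    """Build a placeholder upgrade step that only stamps the new version."""
    def step(d):
        updated = dict(d)
        updated["lipdVersion"] = target
        return updated
    return step


# Ordered chain: each entry upgrades from the previous version to `target`.
UPGRADE_CHAIN = [(1.1, _make_step(1.1)), (1.2, _make_step(1.2)), (1.3, _make_step(1.3))]


def upgrade_to_latest(d):
    """Apply every step newer than the record's version, in order."""
    # A record without a version key is assumed to be v1.0.
    version = float(d.get("lipdVersion", 1.0))
    applied = []
    for target, step in UPGRADE_CHAIN:
        if version < target:
            d = step(d)
            version = target
            applied.append(target)
    return d, applied
```

With this layout, supporting a new version means appending one entry to the chain, and a file several versions behind passes through each intermediate format in turn.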
zips¶
-
lipd.zips.unzipper(filename, dir_tmp)¶ Unzip .lpd file contents to tmp directory. :param str filename: filename.lpd :param str dir_tmp: Tmp folder to extract contents to :return None:
-
lipd.zips.zipper(root_dir='', name='', path_name_ext='')¶ Zips up directory back to the original location :param str root_dir: Root directory of the archive :param str name: <datasetname>.lpd :param str path_name_ext: /path/to/filename.lpd
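Since a .lpd file is unzipped and zipped by these two functions, the round trip can be illustrated with the standard zipfile module. This is a sketch under assumptions, not the lipd code: unzip_to and zip_dir are hypothetical names, and the archive is assumed to be written outside root_dir so that it does not include itself.

```python
import os
import zipfile


def unzip_to(filename, dir_tmp):
    """Extract every member of the archive into dir_tmp."""
    with zipfile.ZipFile(filename) as archive:
        archive.extractall(dir_tmp)


def zip_dir(root_dir, path_name_ext):
    """Zip the contents of root_dir into the archive at path_name_ext."""
    with zipfile.ZipFile(path_name_ext, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(root_dir):
            for name in files:
                full = os.path.join(folder, name)
                # Store paths relative to root_dir so the archive is relocatable.
                archive.write(full, os.path.relpath(full, root_dir))
```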