snowex_db.string_management module
Module for functions that interpret various strings encountered in files. These functions either prep, strip, or interpret strings for headers or the actual data to be uploaded.
- snowex_db.string_management.clean_str(messy)
Removes unwanted characters that we frequently encounter in a string.
- snowex_db.string_management.get_alpha_ratio(str_line, encapsulator='""')
Calculates the ratio of letters to numbers in a string, optionally ignoring anything enclosed by the encapsulator characters.
- Args:
str_line: String to evaluate
encapsulator: Characters that encapsulate strings to be ignored
- Returns:
ratio: Float ratio of the number of letters to the number of numbers
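As a rough illustration of the idea, the sketch below counts letters and digits after dropping encapsulated text. It is not the library's implementation: the name `alpha_ratio_sketch`, the choice to count individual digits, and the value returned when a line has no digits are all assumptions.

```python
import re


def alpha_ratio_sketch(str_line, encapsulator='""'):
    """Hypothetical stand-in: ratio of letters to digits in a line."""
    # First and last characters act as the opening and closing markers
    opener, closer = re.escape(encapsulator[0]), re.escape(encapsulator[-1])
    # Drop encapsulated text (assumption: encapsulators are not nested)
    visible = re.sub(f"{opener}[^{closer}]*{closer}", "", str_line)
    n_letters = sum(ch.isalpha() for ch in visible)
    n_digits = sum(ch.isdigit() for ch in visible)
    # Assumption: a line without digits is reported as infinitely alphabetic
    return n_letters / n_digits if n_digits else float("inf")


print(alpha_ratio_sketch("depth,10,20"))
```

A high ratio suggests a text-heavy line such as a header, while a ratio near zero suggests a row of data.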
- snowex_db.string_management.get_encapsulated(str_line, encapsulator)
Returns the items found inside the encapsulator characters. Useful for finding units.
- Args:
str_line: String containing the encapsulated info we want to extract
encapsulator: String of characters encapsulating the info to be extracted
- Returns:
result: List of strings found between encapsulators
- e.g.
line = 'density (kg/m^3), temperature (C)'
['kg/m^3', 'C'] = get_encapsulated(line, '()')
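The example above can be reproduced with a short regular-expression sketch. This is an assumed approach rather than the library's own code, and `encapsulated_sketch` is a hypothetical name; it treats the first and last characters of `encapsulator` as the opening and closing markers.

```python
import re


def encapsulated_sketch(str_line, encapsulator):
    """Hypothetical stand-in: return every substring between the markers."""
    opener, closer = re.escape(encapsulator[0]), re.escape(encapsulator[-1])
    # Capture everything up to the next closing marker (no nesting support)
    return re.findall(f"{opener}([^{closer}]*){closer}", str_line)


line = "density (kg/m^3), temperature (C)"
units = encapsulated_sketch(line, "()")
print(units)
```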
- snowex_db.string_management.kw_in_here(kw, d, case_sensitive=True)
Determines whether the keyword is found in any of the entries of a list. Returns True if any match is found.
Accepts a list or a dictionary. If a dictionary is supplied, its keys are searched.
e.g.
dielectric_constant is found in [temperature, dielectric_constant_a]
- Args:
kw: Keyword we're searching for
d: List of strings, or dictionary with string keys
case_sensitive: Boolean indicating whether the search should be case sensitive
- Returns:
Bool: Indicating whether the keyword was found
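The documented example implies a substring match: the keyword only needs to appear somewhere inside an entry. A minimal sketch of that reading follows; `kw_in_here_sketch` is a hypothetical name and the body is an assumption, not the library's code.

```python
def kw_in_here_sketch(kw, d, case_sensitive=True):
    """Hypothetical stand-in: is kw a substring of any entry (or key) in d?"""
    # Iterating a dict yields its keys, so lists and dicts share one path
    entries = list(d)
    if not case_sensitive:
        kw = kw.lower()
        entries = [entry.lower() for entry in entries]
    return any(kw in entry for entry in entries)


columns = ["temperature", "dielectric_constant_a"]
print(kw_in_here_sketch("dielectric_constant", columns))
```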
- snowex_db.string_management.line_is_header(str_line, header_sep=',', header_indicator='#', previous_alpha_ratio=None, expected_columns=None)
Determines whether a line is a header line.
- snowex_db.string_management.parse_none(value)
Parses a value, looking for NaNs, Nones, etc.
- Args:
value: Value potentially containing a none or nan
- Returns:
result: None if the string value is nan or none, otherwise the original value
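The return rule can be sketched in a few lines. This version assumes that only strings are inspected and that the comparison ignores case and surrounding whitespace; the source does not list which spellings are recognized, so the set below is illustrative and `parse_none_sketch` is a hypothetical name.

```python
def parse_none_sketch(value):
    """Hypothetical stand-in: map 'nan'/'none' strings to None."""
    # Assumption: non-string values pass through untouched
    if isinstance(value, str) and value.strip().lower() in ("nan", "none"):
        return None
    return value


cleaned = [parse_none_sketch(v) for v in ["NaN", "None", "12.5", 3]]
print(cleaned)
```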
- snowex_db.string_management.remap_data_names(original, rename_map)
Remaps keys in a dictionary according to the rename dictionary. Can also be used for lists, where the entries in the list are renamed.
- Args:
original: List/dictionary of names and values that may need remapping
rename_map: Dictionary mapping names (keys) {old: new}
- Returns:
new: List/dictionary containing the remapped names
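A minimal sketch of the remapping for both input types is shown below. It assumes exact matches on names and that names missing from `rename_map` are kept unchanged; `remap_sketch` is a hypothetical name, not the library's function.

```python
def remap_sketch(original, rename_map):
    """Hypothetical stand-in: rename dict keys or list entries via {old: new}."""
    if isinstance(original, dict):
        # Rename keys, keep values; unmapped keys are left as they are
        return {rename_map.get(k, k): v for k, v in original.items()}
    return [rename_map.get(name, name) for name in original]


rename_map = {"temp": "temperature"}
print(remap_sketch({"temp": -3.5, "depth": 20}, rename_map))
print(remap_sketch(["temp", "depth"], rename_map))
```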
- snowex_db.string_management.standardize_key(messy)
Preps a key for use in dataframe columns or dictionary. Makes everything lowercase, removes units, replaces spaces with underscores.
- Args:
messy: string to be cleaned
- Returns:
clean: String minus all characters and patterns of no interest
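The three steps named above (lowercase, remove units, replace spaces with underscores) might be sketched as follows. The assumption that units appear in parentheses or square brackets is mine, the real function may strip additional characters and patterns, and `standardize_key_sketch` is a hypothetical name.

```python
import re


def standardize_key_sketch(messy):
    """Hypothetical stand-in: lowercase, drop bracketed units, underscore spaces."""
    # Assumption: units are written as "(...)" or "[...]"
    no_units = re.sub(r"\([^)]*\)|\[[^\]]*\]", "", messy)
    # split() discards leading, trailing, and repeated whitespace
    return "_".join(no_units.lower().split())


print(standardize_key_sketch("Snow Depth [cm]"))
```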
- snowex_db.string_management.strip_encapsulated(str_line, encapsulator)
Removes from a string anything that is encapsulated by the given characters, along with the encapsulating characters themselves.
- Args:
str_line: String that has encapsulated info we want removed
encapsulator: String of characters encapsulating the info to be removed
- Returns:
final: String without anything between encapsulators
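A short sketch of the removal, assuming non-nested encapsulators and treating the first and last characters of `encapsulator` as the opening and closing markers. `strip_encapsulated_sketch` is a hypothetical name, and the real function may handle whitespace differently.

```python
import re


def strip_encapsulated_sketch(str_line, encapsulator):
    """Hypothetical stand-in: delete the markers and everything between them."""
    opener, closer = re.escape(encapsulator[0]), re.escape(encapsulator[-1])
    return re.sub(f"{opener}[^{closer}]*{closer}", "", str_line)


line = "density (kg/m^3), temperature (C)"
stripped = strip_encapsulated_sketch(line, "()")
print(repr(stripped))
```

Note that this sketch leaves the whitespace that preceded each removed span in place.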