pylimer_tools.io package
Submodules
pylimer_tools.io.extract_thermo_data module
- pylimer_tools.io.extract_thermo_data.detect_headers(file: str, max_nr_of_lines_to_read: int = 1500, use_cache: bool = True) -> List[str]
Read max_nr_of_lines_to_read lines from the given file and return all possible header lines.
Some assumptions are made regarding the columns, e.g., that 75% of them start with a character.
- Parameters:
file (str) – The file to search for header lines
max_nr_of_lines_to_read (int) – The number of lines to read in search for header lines. Use a negative number to read the whole file.
use_cache (bool) – Whether to read the result from cache or not. The cache is not read if the file has changed in the meantime.
- Returns:
List of detected header lines
- Return type:
List[str]
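The heuristic described above can be illustrated with a small standalone sketch. It is not the implementation used by detect_headers: it assumes that "start with a character" means "start with a letter", and the helper names looks_like_header and detect_header_candidates are hypothetical.

```python
from typing import List


def looks_like_header(line: str, min_fraction: float = 0.75) -> bool:
    """Return True if at least `min_fraction` of the columns start with a letter."""
    columns = line.split()
    if not columns:
        return False
    n_alpha = sum(1 for column in columns if column[0].isalpha())
    return n_alpha / len(columns) >= min_fraction


def detect_header_candidates(
    lines: List[str], max_nr_of_lines_to_read: int = 1500
) -> List[str]:
    """Collect unique header-like lines; a negative limit scans all lines."""
    if max_nr_of_lines_to_read >= 0:
        lines = lines[:max_nr_of_lines_to_read]
    candidates: List[str] = []
    for line in lines:
        stripped = line.strip()
        if looks_like_header(stripped) and stripped not in candidates:
            candidates.append(stripped)
    return candidates


# Hypothetical log excerpt: the same header appears twice.
sample_log = [
    "Step Temp E_pair E_mol TotEng Press",
    "0 1.0 -5.2 0.3 -4.9 0.01",
    "100 1.1 -5.1 0.3 -4.8 0.02",
    "Step Temp E_pair E_mol TotEng Press",
]
header_candidates = detect_header_candidates(sample_log)
```

A heuristic of this kind also accepts prose lines such as warnings, which is one reason to treat the result as a list of candidates rather than a definitive answer.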
- pylimer_tools.io.extract_thermo_data.extract_thermo_params(file, header: str | List[str] | None = 'Step Temp E_pair E_mol TotEng Press', texts_to_read: int = 50, min_line_len: int = 5, use_cache: bool = True, lines_to_read_to_detect_header: int = 100000, lines_to_read_till_header: float = -1) -> DataFrame
Extract the thermodynamic outputs produced for this simulation, i.e., in LAMMPS, by the thermo command.
In particular, this function can handle log files and sections with different columns, and it skips over warnings as well as broken lines.
Note: The header parameter can be a list. Pay attention when reading a file that contains sections with different headers.
- Parameters:
file (str) – The path to the file to read from
header (Union[str, List[str], None]) – The header of the CSV (where to start reading at). Can be a string, a list of strings, or None if you want to try the detection.
texts_to_read (int) – The number of times to expect the header
min_line_len (int) – The minimal length of a line to be accepted as data
use_cache (bool) – Whether to use cache or not (though it will be written anyway). The cache is not read if the file has changed in the meantime.
lines_to_read_to_detect_header (int) – The number of lines to read when trying to detect headers
lines_to_read_till_header (float) – The number of lines that are acceptable to skip until a header should have been found. This is useful for (a) finding the header, and (b) exiting early if you are unsure about the header(s).
- Returns:
The thermodynamic parameters
- Return type:
pd.DataFrame
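As a rough illustration of what reading thermo output involves, the following standalone sketch collects the numeric rows that follow each occurrence of a known header and skips warnings and broken lines. It is an assumption-laden simplification, not the code behind extract_thermo_params: the function name read_thermo_sections is hypothetical, rows are returned as plain dictionaries instead of a DataFrame, and the "Loop time" line that LAMMPS prints after a run is used as the end of a section.

```python
from typing import Dict, List


def read_thermo_sections(
    lines: List[str], header: str, min_line_len: int = 5
) -> List[Dict[str, float]]:
    """Collect numeric rows following each occurrence of `header`."""
    columns = header.split()
    rows: List[Dict[str, float]] = []
    in_section = False
    for raw_line in lines:
        line = " ".join(raw_line.split())  # normalise whitespace
        if line == header:
            in_section = True
            continue
        if not in_section or len(line) < min_line_len:
            continue
        if line.startswith("Loop time"):
            in_section = False  # run summary: the thermo block is over
            continue
        parts = line.split()
        if len(parts) != len(columns):
            continue  # broken line or interleaved warning
        try:
            values = [float(part) for part in parts]
        except ValueError:
            continue  # text that happens to have the right column count
        rows.append(dict(zip(columns, values)))
    return rows


# Hypothetical log excerpt with a warning, a broken line, and two sections.
sample_log = [
    "Step Temp E_pair E_mol TotEng Press",
    "0 1.0 -5.2 0.3 -4.9 0.01",
    "WARNING: Bond/angle/dihedral extent > half of periodic box length",
    "100 1.1 -5.1",
    "200 1.2 -5.0 0.3 -4.7 0.03",
    "Loop time of 1.23 on 1 procs for 200 steps with 100 atoms",
    "Step Temp E_pair E_mol TotEng Press",
    "300 1.0 -5.3 0.2 -5.1 0.02",
]
thermo_rows = read_thermo_sections(sample_log, "Step Temp E_pair E_mol TotEng Press")
```

Only the rows for steps 0, 200, and 300 survive here; the warning and the truncated row for step 100 are dropped.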
- pylimer_tools.io.extract_thermo_data.get_thermo_cache_name_suffix(header: str | List[str] | None = 'Step Temp E_pair E_mol TotEng Press', texts_to_read: float = 50, min_line_len: float = 5) -> str
Compose a cache file suffix such that it distinguishes different thermo reader parameters.
- Parameters:
- Returns:
A string to be used as cache file suffix
- Return type:
str
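One way to build a suffix that distinguishes reader parameters is to hash them. The sketch below is only an illustration of that idea; the actual suffix format of get_thermo_cache_name_suffix is not documented here, and the function name make_cache_suffix, the digest length, and the ".cache" ending are all assumptions.

```python
import hashlib
from typing import List, Union


def make_cache_suffix(
    header: Union[str, List[str], None],
    texts_to_read: float = 50,
    min_line_len: float = 5,
) -> str:
    """Derive a short, deterministic suffix from the reader parameters."""
    # repr() of the tuple keeps None, strings, and lists distinguishable.
    fingerprint = repr((header, texts_to_read, min_line_len))
    digest = hashlib.sha1(fingerprint.encode("utf-8")).hexdigest()[:12]
    return f"-{digest}.cache"


default_suffix = make_cache_suffix("Step Temp E_pair E_mol TotEng Press")
other_suffix = make_cache_suffix("Step Temp E_pair E_mol TotEng Press", min_line_len=4)
```

Identical parameters always give the same suffix, while changing any one of them gives a different cache file name, so results read with different settings are not mixed up.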
- pylimer_tools.io.extract_thermo_data.read_multi_section_separated_value_file(file: str, separator: str | None = None, use_cache: bool = True, comment: str | None = None, skip_err: bool = False) -> DataFrame
Reads a file with multiple sections that have different headers throughout the file.
This function handles files with multiple data sections that may have different column structures. It automatically detects the separator if not specified and combines all sections into a single DataFrame.
- Parameters:
file (str) – Path to the file to read
separator (Union[str, None]) – Character used to separate values in the file (auto-detected if None)
use_cache (bool) – Whether to use cached results if available
comment (Union[str, None]) – Character indicating the start of comments (e.g., “#”)
skip_err (bool) – Whether to skip errors when processing sections
- Returns:
Combined DataFrame containing all data from the file
- Return type:
pd.DataFrame
Note
Particularly useful for reading output files from the DPDSimulator or other multi-section files where the structure may change between sections.
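Combining sections with different columns can be sketched without any dependencies. This is not the library's implementation: the function names are hypothetical, a line is treated as a new header when its first field is not a number (an assumption), and missing columns are filled with None where a DataFrame would typically hold NaN.

```python
from typing import Dict, List, Optional


def is_number(text: str) -> bool:
    """Return True if `text` can be parsed as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def combine_sections(
    lines: List[str], separator: Optional[str] = None, comment: Optional[str] = "#"
) -> List[Dict[str, Optional[float]]]:
    """Merge all sections into rows that share the union of all columns."""
    header: List[str] = []
    all_columns: List[str] = []
    rows: List[Dict[str, float]] = []
    for raw_line in lines:
        line = raw_line.strip()
        if not line or (comment and line.startswith(comment)):
            continue
        fields = [field.strip() for field in line.split(separator)]
        if not is_number(fields[0]):
            header = fields  # a new section starts here
            for column in header:
                if column not in all_columns:
                    all_columns.append(column)
            continue
        if len(fields) != len(header):
            continue  # malformed row; skipped in this sketch
        try:
            rows.append({c: float(v) for c, v in zip(header, fields)})
        except ValueError:
            continue  # non-numeric value in a data row
    return [{column: row.get(column) for column in all_columns} for row in rows]


# Hypothetical file content: the second section gains a "Pressure" column.
sample_lines = [
    "# hypothetical two-section output",
    "Step Temperature",
    "0 1.0",
    "10 1.1",
    "Step Temperature Pressure",
    "20 1.2 0.5",
]
combined = combine_sections(sample_lines)
```

Rows from the first section carry None for "Pressure", so every row has the same set of keys.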
- pylimer_tools.io.extract_thermo_data.read_one_group(fp, header, min_line_len=4, additional_lines_skip=0, lines_to_read_till_header=1000.0) -> str
Read one group of csv lines from the file.
- Parameters:
fp (file object) – The file pointer to the file to read from
header (str or list) – The header of the CSV (where to start reading at)
min_line_len (int) – The minimal length of a line to be accepted as data
additional_lines_skip (int) – Number of lines to skip after reading the header
lines_to_read_till_header (float) – Maximum number of lines to read until finding the header
- Returns:
The filename of a temporary CSV file, or empty string if no data was read
- Return type:
str
pylimer_tools.io.read_pylimer_tools_output_file module
This module provides a few functions to read output from pylimer_tools_cpp’s simulators.
- pylimer_tools.io.read_pylimer_tools_output_file.read_avg_file(filename: str) -> DataFrame
Read an averages-output file from one of the simulators shipped with pylimer_tools.
This function parses the output file format used by pylimer_tools_cpp simulators, handling multiple data sections and converting them to a pandas DataFrame. The function also caches results to improve performance on subsequent reads.
- Parameters:
filename (str) – Path to the averages file to read
- Returns:
DataFrame containing the parsed averages data, grouped by OutputStep
- Return type:
pd.DataFrame
- Note:
The function automatically filters out lines containing “-nan” values, null characters, or fewer than 3 columns.
- Note:
The returned DataFrame is grouped by OutputStep, keeping only the last entry for each step.
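The filtering and grouping behavior described in the two notes can be mimicked in a few lines. The sketch below is an illustration under assumptions, not the code of read_avg_file: the function name keep_last_per_step and the column names other than OutputStep are hypothetical, and rows are plain dictionaries rather than a DataFrame.

```python
from typing import Dict, List


def keep_last_per_step(
    lines: List[str], columns: List[str], key: str = "OutputStep"
) -> List[Dict[str, float]]:
    """Drop unusable lines, then keep only the last row for each step."""
    last_by_step: Dict[float, Dict[str, float]] = {}
    for line in lines:
        if "-nan" in line or "\x00" in line:
            continue  # invalid numbers or null characters
        fields = line.split()
        if len(fields) < 3 or len(fields) != len(columns):
            continue  # too few columns or an incomplete row
        try:
            row = {column: float(value) for column, value in zip(columns, fields)}
        except ValueError:
            continue
        last_by_step[row[key]] = row  # later rows overwrite earlier ones
    return [last_by_step[step] for step in sorted(last_by_step)]


# Hypothetical columns and data: step 10 was written three times.
columns = ["OutputStep", "Time", "Temperature"]
sample_lines = [
    "0 0.0 1.00",
    "10 0.1 -nan",
    "10 0.1 1.05",
    "10 0.1 1.07",
    "20 0.2",
    "20 0.2 1.10",
]
averages = keep_last_per_step(sample_lines, columns)
```

With pandas, the last step of this sketch corresponds to grouping by the key column and taking the last entry of each group.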
pylimer_tools.io.read_lammps_output_file module
This module provides a few functions to read LAMMPS’ output files, including:
log files (thermo output)
dump files (focusing on the coordinates of atoms)
data files (the LAMMPS structure)
averaged data (from fix ave/time... or fix ave/hist...)
correlation data (from fix ave/correlate/...)
- pylimer_tools.io.read_lammps_output_file.read_averages_file(filepath, use_cache: bool = True, sep=' ') -> DataFrame
Read a file written by a fix ave/time command.
Uses pandas’ read_csv after detecting the columns.
- Important assumption: The first 2 or 3 lines in the file are:
comment,
then one header indicating the columns,
and then either data or potentially a second header, if it is a sectioned file (e.g., from a fix ave/time … vector)
- Parameters:
- Returns:
DataFrame containing the parsed average data
- Return type:
pd.DataFrame
- Raises:
FileNotFoundError – If the averages file does not exist
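The file layout assumed above (a comment line, a header line naming the columns, then data) can be parsed as in the following sketch for the non-sectioned case. It is a simplified stand-in for read_averages_file that uses only the standard library; the function name parse_ave_time_scalar and the fix and variable names in the sample are hypothetical.

```python
from typing import Dict, List


def parse_ave_time_scalar(lines: List[str]) -> List[Dict[str, float]]:
    """Parse a non-sectioned fix ave/time file given as a list of lines."""
    # Line 0 is a free-text comment; line 1 names the columns after a '#'.
    columns = lines[1].lstrip("#").split()
    rows: List[Dict[str, float]] = []
    for line in lines[2:]:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or further comment
        if len(fields) != len(columns):
            continue  # incomplete row
        try:
            rows.append({c: float(v) for c, v in zip(columns, fields)})
        except ValueError:
            continue  # non-numeric content
    return rows


# Hypothetical file content in the layout described above.
sample_lines = [
    "# Time-averaged data for fix myavg",
    "# TimeStep v_temp c_press",
    "100 1.01 0.52",
    "200 0.99 0.48",
]
averages = parse_ave_time_scalar(sample_lines)
```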
- pylimer_tools.io.read_lammps_output_file.read_correlation_file(filepath, group_key='Timestep', use_cache: bool = True) -> DataFrame
Read a file written by a fix ave/correlate{/long} command.
- Parameters:
- Returns:
DataFrame containing the correlation data. Use the group_key with the DataFrame’s groupby() to restore the original sections.
- Return type:
pd.DataFrame
- Raises:
FileNotFoundError – If the correlation file does not exist
- pylimer_tools.io.read_lammps_output_file.read_data_file(structure_file: str, atom_style: List[AtomStyle] | None = None) -> Universe
Read a LAMMPS data file (the LAMMPS structure format) into a Universe.
- Parameters:
- Returns:
Universe object representing the molecular structure
- Return type:
Universe
- Raises:
FileNotFoundError – If the structure file does not exist
- pylimer_tools.io.read_lammps_output_file.read_dump_file(data_file, dump_file, atom_style: List[AtomStyle] | None = None) -> UniverseSequence
Read a LAMMPS dump file containing snapshots of structures into a UniverseSequence.
- Parameters:
- Returns:
Sequence of Universe objects representing the trajectory
- Return type:
UniverseSequence
- pylimer_tools.io.read_lammps_output_file.read_histogram_file(filepath, use_cache: bool = True) -> DataFrame
Read a file written by fix ave/hist or similar.
This is a wrapper around read_sectioned_averages_file for histogram data.
- Parameters:
- Returns:
DataFrame containing the parsed histogram data
- Return type:
pd.DataFrame
- See:
read_sectioned_averages_file()
- pylimer_tools.io.read_lammps_output_file.read_log_file(filepath, lines_to_read_to_detect_header=500000) -> DataFrame
Read a LAMMPS log (thermo output) file.
- pylimer_tools.io.read_lammps_output_file.read_sectioned_averages_file(filepath, use_cache: bool = True) -> DataFrame
Read a file written by a fix ave/time command with multiple sections.
Use the section delimiter columns together with pandas’ groupby() to restore the original sections.
- Parameters:
- Returns:
DataFrame containing the parsed sectioned data
- Return type:
pd.DataFrame
- Raises:
FileNotFoundError – If the file does not exist
ValueError – If the file format is not recognized as a proper sectioned averages file
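To make the idea of section delimiter columns concrete, the sketch below flattens a sectioned fix ave/time file into rows that repeat the section values, then restores the sections by grouping. It is a standard-library illustration and not the implementation of read_sectioned_averages_file; the function name parse_sectioned and the sample fix and compute names are hypothetical, and itertools.groupby stands in for pandas' groupby().

```python
from itertools import groupby
from typing import Dict, List


def parse_sectioned(lines: List[str]) -> List[Dict[str, float]]:
    """Flatten a sectioned averages file into one row per data line."""
    lines = [line for line in lines if line.strip()]
    if len(lines) < 3 or not all(line.startswith("#") for line in lines[:3]):
        raise ValueError("expected three leading comment/header lines")
    section_columns = lines[1].lstrip("#").split()  # e.g. TimeStep Number-of-rows
    row_columns = lines[2].lstrip("#").split()  # e.g. Row c_rdf[1] c_rdf[2]
    rows: List[Dict[str, float]] = []
    index = 3
    while index < len(lines):
        section_values = [float(value) for value in lines[index].split()]
        n_rows = int(section_values[1])  # second field: rows in this section
        for line in lines[index + 1 : index + 1 + n_rows]:
            row = dict(zip(section_columns, section_values))
            row.update(zip(row_columns, (float(value) for value in line.split())))
            rows.append(row)
        index += 1 + n_rows
    return rows


# Hypothetical file content with two sections of two rows each.
sample_lines = [
    "# Time-averaged data for fix myvec",
    "# TimeStep Number-of-rows",
    "# Row c_rdf[1] c_rdf[2]",
    "100 2",
    "1 0.5 0.1",
    "2 1.5 0.9",
    "200 2",
    "1 0.5 0.2",
    "2 1.5 0.8",
]
sectioned_rows = parse_sectioned(sample_lines)

# Restore the original sections by grouping on the delimiter column.
sections = {
    step: [row["Row"] for row in group]
    for step, group in groupby(sectioned_rows, key=lambda row: row["TimeStep"])
}
```

Because every flattened row carries its section's TimeStep, grouping on that column recovers the per-timestep blocks.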
pylimer_tools.io.unit_styles module
- class pylimer_tools.io.unit_styles.UnitStyle(unit_configuration: dict, ureg: UnitRegistry)
Bases: object
UnitStyle: a collection of the units of a particular LAMMPS unit style, expressed in SI units (i.e., use this to convert your LAMMPS output data to SI units).
Example usage:
unit_style_factory = UnitStyleFactory()
unit_style = unit_style_factory.get_unit_style(
    "lj", polymer="pdms", warning=False, accept_mol=True)
# multiply with the following factor to convert LJ stress to SI units,
# namely MPa in this example:
lj_stress_to_si_conversion_factor = (1.*unit_style.pressure).to("MPa").magnitude
Initialize a UnitStyle object.
- Parameters:
unit_configuration (dict) – Dictionary containing unit definitions
ureg (UnitRegistry) – Pint unit registry to use for unit conversions
- __getattr__(property: str)
Shorthand access for get_base_unit_of().
- Parameters:
property (str) – The property name to get the unit for
- Returns:
The unit object for the requested property
- Return type:
pint.Quantity
Example usage:
units = get_unit_style("lj")
mass_with_units = mass_in_lj * units.mass
- get_base_unit_of(property: str)
Returns the conversion factor from the unit style to SI units.
- Parameters:
property (str) – The property name to get the unit for (e.g., “mass”, “distance”)
- Returns:
The unit object for the requested property
- Return type:
pint.Quantity
Example usage:
units = get_unit_style("lj")
mass_in_si = mass_in_lj * units.get_base_unit_of("mass")
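The arithmetic behind such a conversion factor can be written out by hand for the pressure unit. In LJ units, pressure is measured in units of epsilon / sigma^3, so the factor follows from the energy and length scales. The values below are hypothetical round numbers chosen for illustration; they are not the PDMS values used by the library, and the function name lj_stress_to_mpa is an assumption.

```python
# Hypothetical LJ scales (NOT taken from pylimer_tools or any polymer data):
SIGMA_M = 1.0e-9  # LJ length unit sigma, in metres
EPSILON_J = 1.0e-21  # LJ energy unit epsilon, in joules

# One LJ pressure unit corresponds to epsilon / sigma^3 in pascals.
pressure_unit_pa = EPSILON_J / SIGMA_M**3
pressure_unit_mpa = pressure_unit_pa / 1.0e6


def lj_stress_to_mpa(stress_lj: float) -> float:
    """Convert a stress given in LJ units to MPa using the scales above."""
    return stress_lj * pressure_unit_mpa


stress_mpa = lj_stress_to_mpa(0.25)
```

With these scales, one LJ pressure unit is approximately 1 MPa, so the conversion leaves the numerical value nearly unchanged; real polymer-specific scales give different factors.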
- class pylimer_tools.io.unit_styles.UnitStyleFactory
Bases: object
This is a factory to get multiple instances of different UnitStyle using the same UnitRegistry, such that they are compatible.
Initialize the UnitStyleFactory with a new UnitRegistry.
- get_available_polymers() -> list
List all available polymers for which LJ unit conversions are available.
- Returns:
List of polymer names
- Return type:
list
- get_everares_et_al_data() -> DataFrame
Load the Everaers et al. (2020) unit properties data.
- Returns:
DataFrame containing polymer properties from Everaers et al.
- Return type:
pd.DataFrame
- get_unit_registry()
Get the underlying unit registry.
- Returns:
The unit registry used by this factory
- Return type:
UnitRegistry
- get_unit_style(unit_type: str, dimension: int = 3, **kwargs) -> UnitStyle
Get a UnitStyle instance corresponding to the unit system requested.
- Parameters:
- Returns:
A UnitStyle object for the requested unit system
- Return type:
UnitStyle
- Raises:
ValueError – If required parameters are missing
NotImplementedError – If the requested unit type is not implemented
For LJ units, you must specify the polymer using the polymer parameter.
See also
Warning
Please check the source code of this function to see whether the units you need are correctly implemented.
Module contents
This module contains various utility functions for working with LAMMPS and pylimer_tools input and output files.