Tutorial 4: Loading existing s_cube objects and export options
flowTorch workshop 29.09.2025 - 02.10.2025
Outline
Load existing s_cube objects
Creating a new HDF5 file for each exported field
Exporting data in batches or snapshot-by-snapshot
In this tutorial we briefly look at the different options for exporting data from \(S^3\). These options are especially useful for large datasets, which \(S^3\) was originally designed for. The first steps are the same as presented in tutorial 1.
Prerequisites: Execution of the cylinder2D simulation from tutorial 1.
1. Loading existing s_cube objects
[1]:
import sys
import torch as pt
from typing import Union
from stl import mesh
from os.path import join
from os import environ, system
environ["sparseSpatialSampling"] = "../../.."
sys.path.insert(0, environ["sparseSpatialSampling"])
from sparseSpatialSampling.export import ExportData
from sparseSpatialSampling.utils import load_foam_data, load_original_Foam_fields
Warning: TecplotDataloader can't be loaded. Most likely, the 'paraview' module is missing.
Refer to the installation instructions at https://github.com/FlowModelingControl/flowtorch
If you are not using the TecplotDataloader, ignore this warning.
[2]:
# path to the CFD data and settings, assuming they are in the top-level of the repository
load_path = join("..", "..", "..", "run", "tutorials", "tutorial_1")
load_path_cfd = join("..", "..", "..", "flowTorch_Workshop_2025", "cylinder_2D_Re100")
# define the path to where we want to save the results and the name of the file
save_path = join("..","..", "..", "run", "tutorials", "tutorial_4")
[3]:
# load the s_cube object
s_cube = pt.load(join(load_path, "s_cube_cylinder2D_metric_0.75.pt"), weights_only=False)
# load the velocity and pressure fields of the simulation
bounds = [[0, 0], [2.2, 0.41]]
field_U, coord, _, write_times = load_foam_data(load_path_cfd, bounds, field_name="U", t_start=8, scalar=False)
field_p, _, _, _ = load_foam_data(load_path_cfd, bounds, t_start=8)
[2025-08-15 11:42:17] INFO Loading precomputed cell centers and volumes from processor0/constant
[2025-08-15 11:42:17] INFO Loading precomputed cell centers and volumes from processor1/constant
[2025-08-15 11:42:21] INFO Loading precomputed cell centers and volumes from processor0/constant
[2025-08-15 11:42:21] INFO Loading precomputed cell centers and volumes from processor1/constant
2. Creating a new file for each field
In tutorial 1, we created a single HDF5 file containing all the data from our simulation. However, especially for large amounts of data, a single large HDF5 file may be impractical. Instead, \(S^3\) allows us to create a separate HDF5 file for each field, so we end up with several smaller files that are easier to handle. The corresponding field name is appended to each HDF5 file name.
[4]:
# instantiate an export object, here we want to create a new HDF5 file for each field
export = ExportData(s_cube, write_new_file_for_each_field=True)
# we have to overwrite the save_path and save_name, since we want to save this in another directory
export.save_dir = save_path
export.save_name = "cylinder2D_Re100_new_file"
# for demonstration purposes we only export a single snapshot
export.write_times = write_times[-1]
# now export the last snapshot of the velocity field
export.export(coord, field_U[:, :, -1].unsqueeze(-1), "U")
# now export the last snapshot of the pressure field into a new file
export.export(coord, field_p[:, -1].unsqueeze(-1).unsqueeze(-1), "p")
[2025-08-15 11:42:23] INFO Starting interpolation and export of field U.
[2025-08-15 11:42:23] INFO Writing HDF5 file for field U.
[2025-08-15 11:42:23] INFO Writing XDMF file for file cylinder2D_Re100_new_file_U.h5
[2025-08-15 11:42:23] INFO Finished export of field U in 0.101s.
[2025-08-15 11:42:23] INFO Starting interpolation and export of field p.
[2025-08-15 11:42:23] INFO Writing HDF5 file for field p.
[2025-08-15 11:42:23] INFO Writing XDMF file for file cylinder2D_Re100_new_file_p.h5
[2025-08-15 11:42:23] INFO Finished export of field p in 0.037s.
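Judging from the log output above, the field name is appended to save_name with an underscore. A minimal sketch of that naming pattern, assuming this convention holds for all fields. The helper expected_h5_files is hypothetical and not part of sparseSpatialSampling; it only builds the paths and does not check that the files exist:

```python
from os.path import join


def expected_h5_files(save_dir: str, save_name: str, field_names: list) -> list:
    """Hypothetical helper: build the HDF5 paths expected when one file is written per field."""
    # assumed pattern (taken from the log output): <save_name>_<field name>.h5
    return [join(save_dir, f"{save_name}_{f}.h5") for f in field_names]


# same save name as in the cell above; the directory is shortened for illustration
files = expected_h5_files(join("run", "tutorials", "tutorial_4"), "cylinder2D_Re100_new_file", ["U", "p"])
for f in files:
    print(f)
```

Such a list can be handy when post-processing scripts need to pick up the per-field files afterwards.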
3. Exporting data in batches or snapshot-by-snapshot
So far we always loaded and exported the complete data matrix at once. For larger datasets, however, the data is unlikely to fit into memory all at once. To avoid this issue, we can load and export the data in batches or, for very large snapshots, even snapshot-by-snapshot.
To make use of this functionality, we only have to set the parameter n_snapshots_total of the export() method to n_snapshots_total=len(write_times). This is required so that the export() method knows how many snapshots to expect.
The overall approach can be summarized as follows:
1. Load a certain number of snapshots \(N\), where \(1 \le N \le N_\mathrm{snapshots}\); \(N\) has to be chosen based on the memory requirements
2. Pass them to the export() method as before, but pass the additional argument n_snapshots_total=len(write_times) (total number of snapshots to export)
3. Continue with 1. until all snapshots are exported
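The steps above can be sketched independently of \(S^3\). The following is a minimal illustration of the batching logic only: iter_batches and the write times are hypothetical stand-ins, and the comments mark where loading and the call to export() would take place.

```python
import math


def iter_batches(items: list, batch_size: int):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# stand-in for the write times of a simulation (hypothetical values)
times = [f"{0.01 * i:.2f}" for i in range(1, 11)]
batch_size = 4
n_batches = math.ceil(len(times) / batch_size)

exported = []
for k, batch in enumerate(iter_batches(times, batch_size), start=1):
    # step 1: the snapshots belonging to `batch` would be loaded here
    # step 2: export(..., n_snapshots_total=len(times)) would be called here
    exported.extend(batch)
    print(f"batch {k} / {n_batches}: {len(batch)} snapshots")
```

The last batch is simply shorter when the number of snapshots is not a multiple of the batch size, so no special handling is needed.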
This procedure will be shown in the following. The function export_fields_snapshot_wise below creates an abstraction for easier usage.
Note: When executed in a Jupyter notebook, the following code creates an HDF5 and XDMF file that can't be opened in ParaView; the cause is unknown. To use this code productively, copy it into a Python script and execute it separately. In that case everything works.
[5]:
def export_fields_snapshot_wise(load_dir: str, datawriter: ExportData, field_names: Union[str, list], boundaries: list,
                                write_times: Union[str, list], batch_size: int = 25) -> None:
    """
    For each field specified, interpolate all snapshots onto the generated grid and export it to HDF5 & XDMF. The
    interpolation and export of the data is performed snapshot-by-snapshot (batch_size = 1) or in batches to avoid out
    of memory issues for large datasets.
    :param load_dir: path to the simulation data
    :param datawriter: ExportData object after executing the S^3 algorithm
    :param field_names: names of the fields to export
    :param boundaries: boundaries of the masked area of the domain (needs to be the same as used for loading the
                       vertices and computing the metric)
    :param write_times: the write times of the simulation
    :param batch_size: batch size, number of snapshots which should be interpolated and exported at once
    :return: None
    """
    # make sure the type is correct
    write_times = write_times if isinstance(write_times, list) else [write_times]
    field_names = field_names if isinstance(field_names, list) else [field_names]
    # set the write times in case we haven't done that already
    if datawriter.write_times is None:
        datawriter.write_times = write_times
    # now loop over all fields
    for f in field_names:
        counter = 1
        # compute the required number of batches
        if not len(datawriter.write_times) % batch_size:
            n_batches = int(len(datawriter.write_times) / batch_size)
        else:
            n_batches = int(len(datawriter.write_times) / batch_size) + 1
        # now loop over all batches
        for i in pt.arange(0, len(datawriter.write_times), step=batch_size).tolist():
            print(f"Exporting batch {counter} / {n_batches}")
            # load the required number of snapshots
            coordinates, data = load_original_Foam_fields(load_dir, datawriter.n_dimensions, boundaries, field_names=f,
                                                          write_times=datawriter.write_times[i:i + batch_size])
            # in case the field is not available, no data is returned (data is None) and the batch is skipped
            if data is not None:
                # export the current batch
                datawriter.export(coordinates, data, f, n_snapshots_total=len(datawriter.write_times))
            counter += 1
[6]:
# check how many snapshots we have
print(f"Number of snapshots: {len(write_times)}")
Number of snapshots: 1001
[7]:
# now we want to export the data for the last 500 snapshots of the velocity field in batches
export = ExportData(s_cube)
export.save_name = "cylinder2D_Re100"
export.save_dir = save_path
# batch_size = 1 would mean we export the data snapshot-by-snapshot. Since our data is very small we choose a larger batch size
export_fields_snapshot_wise(load_path_cfd, export, "U", bounds, write_times[-500:], batch_size=100)
[2025-08-15 11:42:23] INFO Loading precomputed cell centers and volumes from processor0/constant
[2025-08-15 11:42:23] INFO Loading precomputed cell centers and volumes from processor1/constant
Exporting batch 1 / 5
[2025-08-15 11:42:23] INFO Starting interpolation and export of field U.
[2025-08-15 11:42:24] INFO Writing HDF5 file for field U.
[2025-08-15 11:42:24] INFO Loading precomputed cell centers and volumes from processor0/constant
[2025-08-15 11:42:24] INFO Loading precomputed cell centers and volumes from processor1/constant
Exporting batch 2 / 5
[2025-08-15 11:42:25] INFO Loading precomputed cell centers and volumes from processor0/constant
[2025-08-15 11:42:25] INFO Loading precomputed cell centers and volumes from processor1/constant
Exporting batch 3 / 5
[2025-08-15 11:42:25] INFO Loading precomputed cell centers and volumes from processor0/constant
[2025-08-15 11:42:25] INFO Loading precomputed cell centers and volumes from processor1/constant
Exporting batch 4 / 5
[2025-08-15 11:42:26] INFO Loading precomputed cell centers and volumes from processor0/constant
[2025-08-15 11:42:26] INFO Loading precomputed cell centers and volumes from processor1/constant
Exporting batch 5 / 5
[2025-08-15 11:42:27] INFO Writing XDMF file for file cylinder2D_Re100.h5
[2025-08-15 11:42:27] INFO Finished export of field U in 4.153s.
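The batch size of 100 used above was picked by hand. For larger cases, a rough upper bound can be estimated from the size of one snapshot. The sketch below is only a back-of-the-envelope estimate under stated assumptions: max_batch_size is a hypothetical helper, the cell count and memory budget are made-up numbers, and bytes_per_value=4 assumes single-precision values (the default floating point type in PyTorch). It counts the raw field values only, so the real memory demand during interpolation and export will be higher.

```python
def max_batch_size(n_cells: int, n_dims: int, budget_bytes: float, bytes_per_value: int = 4) -> int:
    """Largest number of snapshots whose raw values fit into budget_bytes (at least 1)."""
    # memory of a single snapshot: one value per cell and per field component
    bytes_per_snapshot = n_cells * n_dims * bytes_per_value
    return max(1, int(budget_bytes // bytes_per_snapshot))


# hypothetical example: 10 million cells, 3 velocity components, 2 GB memory budget
n = max_batch_size(n_cells=10_000_000, n_dims=3, budget_bytes=2e9)
print(f"Upper bound for the batch size: {n}")
```

In practice it is safer to stay well below this bound, since the original and the interpolated data have to be held in memory at the same time.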