Tutorial 5: Using the \(\mathbf{S^3}\) Dataloader and Datawriter classes to load and write arbitrary data to HDF5
Outline
1. Using the \(\mathbf{S^3}\) Dataloader to load \(\mathbf{S^3}\) data
2. Using the \(\mathbf{S^3}\) Datawriter to write arbitrary data to HDF5
In this tutorial, we will briefly explain the usage of the \(S^3\) Dataloader and Datawriter to load data generated by \(S^3\) and write arbitrary data for an \(S^3\) grid.
As in the tutorials before, we first import the required modules and set all the paths:
[1]:
import sys
from os.path import join
from os import environ, makedirs
from os.path import exists
environ["sparseSpatialSampling"] = join("..", "..", "..")
sys.path.insert(0, environ["sparseSpatialSampling"])
from sparseSpatialSampling.utils import Datawriter, Dataloader, compute_svd
Warning: TecplotDataloader can't be loaded. Most likely, the 'paraview' module is missing.
Refer to the installation instructions at https://github.com/FlowModelingControl/flowtorch
If you are not using the TecplotDataloader, ignore this warning.
[2]:
# path to the CFD data and settings, assuming they are in the top-level of the repository
load_path = join("..", "..", "..", "run", "tutorials", "tutorial_1")
# define the path to where we want to save the results and the name of the file
save_path = join("..","..", "..", "run", "tutorials", "tutorial_5")
# name of the HDF5 file
file_name = "cylinder2D_metric_0.75"
# create directory
if not exists(save_path):
    makedirs(save_path)
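As a side note, the check-then-create pattern above can be collapsed into a single call with the `exist_ok` argument of `os.makedirs`. The following is a minimal sketch that uses a temporary directory as a stand-in for the repository, so the path shown here is only illustrative:

```python
import tempfile
from os import makedirs
from os.path import isdir, join

# a temporary directory stands in for the top-level of the repository (assumption for this sketch)
with tempfile.TemporaryDirectory() as tmp:
    save_path = join(tmp, "run", "tutorials", "tutorial_5")
    # creates all missing parent directories and does not fail if the path already exists
    makedirs(save_path, exist_ok=True)
    # calling it a second time is harmless, so the cell can be re-executed safely
    makedirs(save_path, exist_ok=True)
    created = isdir(save_path)

print("Directory created:", created)
```

Both variants behave the same for this tutorial; the single call merely avoids the separate `exists()` check.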
1. Using the \(\mathbf{S^3}\) Dataloader to load \(\mathbf{S^3}\) data
We now want to use the \(S^3\) Dataloader to access the data within the HDF5 file of tutorial 1. To do so, we instantiate the Dataloader as follows:
[3]:
# instantiate a dataloader to load Scube data
dataloader = Dataloader(load_path, f"{file_name}.h5")
# print some information about the contents of the HDF5 file
print("Available fields:", dataloader.field_names[dataloader.write_times[0]])
print("Grid size:", dataloader.vertices.size())
# we can also load the metric field used to generate the grid and the cell area (or volume)
print(dataloader.metric.size(), dataloader.weights.size())
Available fields: ['p']
Grid size: torch.Size([3734, 2])
torch.Size([3734]) torch.Size([3734])
We can access, e.g., the available write times, field names, the metric field and the cell volumes (weights) directly via the properties of the dataloader instance. We can also load snapshots using the load_snapshot() method for further post-processing. In the following step, we will compute an SVD of the velocity field as an illustrative example:
[4]:
# we want to load all available snapshots in the interval 4 s <= t < 5 s
t_start, t_end = 4, 5
write_times = sorted([t for t in dataloader.write_times if (t_start <= float(t) < t_end)], key=float)
# load the velocity field for the specified write times
field = dataloader.load_snapshot("U", write_times)
# perform an SVD weighted with cell areas (dataloader.weights) as an example computation
s, U, V = compute_svd(field, dataloader.weights)
print(field.size(), s.shape, U.shape, V.shape)
torch.Size([3734, 2, 301]) torch.Size([124]) torch.Size([3734, 2, 124]) torch.Size([301, 124])
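How compute_svd applies the weights internally is not shown in this tutorial. A common way to weight an SVD with cell areas, assumed here purely for illustration, is to scale each row of the data matrix by the square root of its cell area, compute a standard SVD, and undo the scaling on the left singular vectors. The sketch below uses NumPy instead of PyTorch, a scalar field, and random toy data; the function `weighted_svd` is hypothetical and not part of \(S^3\):

```python
import numpy as np


def weighted_svd(field, weights):
    """Sketch of an area-weighted SVD (an assumption, not the implementation of compute_svd).

    field:   array of shape (n_cells, n_snapshots)
    weights: array of shape (n_cells,) holding the cell areas
    """
    sqrt_w = np.sqrt(weights).reshape(-1, 1)
    # scale each row so that the Euclidean inner product of the scaled data
    # equals the area-weighted inner product of the original data
    U_w, s, Vh = np.linalg.svd(sqrt_w * field, full_matrices=False)
    # undo the scaling so that the modes live on the original grid again
    U = U_w / sqrt_w
    return s, U, Vh.T


# toy data: 50 cells, 8 snapshots
rng = np.random.default_rng(0)
n_cells, n_snapshots = 50, 8
field = rng.standard_normal((n_cells, n_snapshots))
weights = rng.uniform(0.5, 2.0, n_cells)

s, U, V = weighted_svd(field, weights)

# the modes are orthonormal with respect to the area-weighted inner product
gram = U.T @ (weights[:, None] * U)
# the factorization still reproduces the original data
reconstruction = U @ np.diag(s) @ V.T
print(U.shape, s.shape, V.shape)
```

With this scaling, cells with a large area contribute more to the decomposition than small cells, which matters on a non-uniform grid such as the one generated by \(S^3\).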
You can find the full documentation of the Dataloader in the Python API.
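The selection of write times used above converts each write time with float() before comparing and sorting, which suggests that the write times are stored as strings. The following library-independent sketch illustrates why sorting by the numerical value matters. The helper `select_write_times` and the list of write times are hypothetical:

```python
def select_write_times(write_times, t_start, t_end):
    """Return all write times within [t_start, t_end), sorted by their numerical value."""
    selected = [t for t in write_times if t_start <= float(t) < t_end]
    return sorted(selected, key=float)


# hypothetical write times stored as strings, deliberately unordered
available = ["10", "4.001", "3.999", "5.000", "4.000", "4.5"]

selected = select_write_times(available, 4, 5)
print(selected)  # ['4.000', '4.001', '4.5']

# a plain string sort would order the times lexicographically instead
lexicographic = sorted(available)
print(lexicographic)  # ['10', '3.999', '4.000', '4.001', '4.5', '5.000']
```

Note that the upper bound is exclusive, so a snapshot at exactly t = 5 s is not part of the selection.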
2. Using the \(\mathbf{S^3}\) Datawriter to write arbitrary data to HDF5
In the previous tutorials, we used the export() method of the ExportData class to interpolate and export time-dependent field data to HDF5. However, the results of our SVD (or whatever post-processing you do) are not time-dependent. We could just dump it into a pickle file or similar, but then we wouldn’t be able to visualize the POD modes, for example in ParaView. To solve this issue, we can use the \(S^3\) Datawriter class to write any data into an HDF5 file.
The write_data() method of the Datawriter class can write three kinds of groups of data:
group="grid" writes the \(S^3\) grid stored in the Dataloader
group="constant" writes constant data (static in time)
group="data" writes time-dependent data
To visualize data later, we have to write a grid. We can either do this manually or use the write_grid() method for convenience, as shown below. Next, we write our results from the SVD. Since this data is static in time, we use the group constant. If we also write time-dependent data, the constant data will be written into the first time step. Lastly, we write five snapshots of the velocity field to the group data, since the velocity field is time-dependent.
Note 1: To visualize data, e.g., in ParaView, the fields written to the HDF5 file must have the dimensions of the grid generated by \(S^3\). Data with any other dimensions can still be written, but cannot be visualized in ParaView.
Note 2: The Datawriter does not perform any interpolation of data onto the \(S^3\) grid (in contrast to the ExportData class).
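To make Note 1 more concrete, the following sketch checks which of the arrays from this tutorial have the dimensions of the grid. It assumes that "dimension of the grid" means that the leading dimension equals the number of cells; this criterion and the helper `matches_grid` are assumptions for illustration and not part of the Datawriter API. The shapes are taken from the outputs shown above:

```python
def matches_grid(shape, n_cells):
    """Check whether the leading dimension of a field equals the number of grid cells."""
    return len(shape) > 0 and shape[0] == n_cells


# number of cells of the grid loaded in this tutorial
n_cells = 3734

# shapes of the data we are going to write (a single mode is one slice of U)
shapes = {
    "mode_1": (3734, 2),
    "cell_area": (3734,),
    "V": (301, 124),
    "s": (124,),
}

report = {name: matches_grid(shape, n_cells) for name, shape in shapes.items()}
for name, ok in report.items():
    print(f"{name}: {'matches the grid' if ok else 'does not match the grid'}")
```

Under this assumption, the modes and the cell areas can be displayed on the grid, while the right singular vectors V and the singular values s are plain arrays stored alongside them.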
[5]:
# write the data to HDF5 & XDMF
datawriter = Datawriter(save_path, f"{file_name}_svd.h5")
# write the grid, for convenience we use the write_grid() method:
datawriter.write_grid(dataloader)
# we could also write the grid as:
# datawriter.n_cells = dataloader.vertices.shape[0]
# datawriter.write_data("centers", group="grid", data=dataloader.vertices)
# datawriter.write_data("vertices", group="grid", data=dataloader.nodes)
# datawriter.write_data("faces", group="grid", data=dataloader.faces)
# set the max. number of modes to write, here we want to write all available modes
n_modes = U.size(-1)
# write the modes as vectors, where each mode is treated as an independent vector.
# Since they have the dimensions of the grid we can visualize them in ParaView. However, they are not time-dependent, so we write them in group constant
for i in range(n_modes):
    if len(U.size()) == 2:
        datawriter.write_data(f"mode_{i + 1}", group="constant", data=U[:, i].squeeze())
    else:
        datawriter.write_data(f"mode_{i + 1}", group="constant", data=U[:, :, i].squeeze())
# write the remaining data. V and s are not referenced in the XDMF file, since their size doesn't match the dimensions of the grid
datawriter.write_data("V", group="constant", data=V)
datawriter.write_data("s", group="constant", data=s)
# the cell areas have one entry per grid cell
datawriter.write_data("cell_area", group="constant", data=dataloader.weights)
# we can also write time-dependent data, for example the first 5 snapshots of the flow field.
# To do so, we have to write it to the group data. We can only write a single time step per call, hence
# the export() method is recommended for exporting temporal data
for i, t in enumerate(write_times[:5]):
    print(f"Writing time step t = {t} s.")
    datawriter.write_data("U", group="data", data=field[:, :, i], time_step=t)
# it is important to close the datawriter. Otherwise, we can't execute this cell multiple times, since Jupyter is caching the file handle
datawriter.close()
# write XDMF file, so we can open it up in ParaView
datawriter.write_xdmf_file()
[2026-02-23 08:54:54] INFO Writing XDMF file for file cylinder2D_metric_0.75_svd.h5
Writing time step t = 4.000 s.
Writing time step t = 4.001 s.
Writing time step t = 4.002 s.
Writing time step t = 4.003 s.
Writing time step t = 4.004 s.
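The branch on len(U.size()) in the cell above distinguishes between scalar fields, where the modes have the shape [n_cells, n_modes], and vector fields, where they have the shape [n_cells, n_dimensions, n_modes]. Since the mode index is the last axis in both cases, indexing with an ellipsis covers both without a branch. The sketch below uses NumPy arrays with toy sizes; the same indexing syntax also works for PyTorch tensors. The helper `get_mode` is hypothetical:

```python
import numpy as np

# toy sizes, not the ones of the tutorial
n_cells, n_dims, n_modes = 6, 2, 3

# modes of a scalar field and of a vector field
U_scalar = np.arange(n_cells * n_modes, dtype=float).reshape(n_cells, n_modes)
U_vector = np.arange(n_cells * n_dims * n_modes, dtype=float).reshape(n_cells, n_dims, n_modes)


def get_mode(U, i):
    """Return the i-th mode; the ellipsis selects all leading axes."""
    return U[..., i]


scalar_mode = get_mode(U_scalar, 1)
vector_mode = get_mode(U_vector, 1)
print(scalar_mode.shape, vector_mode.shape)  # (6,) (6, 2)
```

Whether to keep the explicit branch or use the ellipsis is a matter of taste; the explicit version makes the two cases more visible to readers new to the data layout.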
We can now open up the XDMF file in ParaView and visualize the field data. This concludes tutorial 5.