{
 "cells": [
  {
   "metadata": {
    "collapsed": true
   },
   "cell_type": "markdown",
   "source": [
    "# Tutorial 5: Using the $\\mathbf{S^3}$ `Dataloader` and `Datawriter` classes to load and write arbitrary data to HDF5\n",
    "\n",
    "### Outline\n",
    "1. Using the $\\mathbf{S^3}$ `Dataloader` to load $\\mathbf{S^3}$ data\n",
    "2. Using the $\\mathbf{S^3}$ `Datawriter` to write arbitrary data to HDF5\n",
    "\n",
    "In this tutorial, we briefly explain how to use the $S^3$ `Dataloader` to load data generated by $S^3$ and the $S^3$ `Datawriter` to write arbitrary data for an $S^3$ grid.\n",
    "\n",
    "As in the previous tutorials, we first import the required modules and set all the paths:"
   ],
   "id": "6e83d4bf2e14bbf1"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-02-23T07:54:54.466608443Z",
     "start_time": "2026-02-23T07:54:51.030377665Z"
    }
   },
   "cell_type": "code",
   "source": [
    "import sys\n",
    "\n",
    "from os import environ, makedirs\n",
    "from os.path import exists, join\n",
    "\n",
    "environ[\"sparseSpatialSampling\"] = join(\"..\", \"..\", \"..\")\n",
    "sys.path.insert(0, environ[\"sparseSpatialSampling\"])\n",
    "\n",
    "from sparseSpatialSampling.utils import Datawriter, Dataloader, compute_svd"
   ],
   "id": "ef16c7b33c1d04b5",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Warning: TecplotDataloader can't be loaded. Most likely, the 'paraview' module is missing.\n",
      "Refer to the installation instructions at https://github.com/FlowModelingControl/flowtorch\n",
      "If you are not using the TecplotDataloader, ignore this warning.\n"
     ]
    }
   ],
   "execution_count": 1
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-02-23T07:54:54.494866210Z",
     "start_time": "2026-02-23T07:54:54.470619491Z"
    }
   },
   "cell_type": "code",
   "source": [
    "# path to the HDF5 file generated in tutorial 1, relative to the top-level of the repository\n",
    "load_path = join(\"..\", \"..\", \"..\", \"run\", \"tutorials\", \"tutorial_1\")\n",
    "\n",
    "# path to the directory in which we want to save the results\n",
    "save_path = join(\"..\", \"..\", \"..\", \"run\", \"tutorials\", \"tutorial_5\")\n",
    "\n",
    "# name of the HDF5 file (without file extension)\n",
    "file_name = \"cylinder2D_metric_0.75\"\n",
    "\n",
    "# create directory\n",
    "if not exists(save_path):\n",
    "    makedirs(save_path)"
   ],
   "id": "717a6b11168c8eab",
   "outputs": [],
   "execution_count": 2
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## 1. Using the $\\mathbf{S^3}$ `Dataloader` to load $\\mathbf{S^3}$ data\n",
    "\n",
    "We now want to use the $S^3$ `Dataloader` to access the data within the HDF5 file of [tutorial 1](tutorial1_cylinder2D_Re100.ipynb). To do so, we instantiate the `Dataloader` as follows:"
   ],
   "id": "423c996e7d9485b"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-02-23T07:54:54.595029160Z",
     "start_time": "2026-02-23T07:54:54.513139970Z"
    }
   },
   "cell_type": "code",
   "source": [
    "# instantiate a dataloader to load S^3 data\n",
    "dataloader = Dataloader(load_path, f\"{file_name}.h5\")\n",
    "\n",
    "# print some information about the contents of the HDF5 file\n",
    "print(\"Available fields:\", dataloader.field_names[dataloader.write_times[0]])\n",
    "print(\"Grid size:\", dataloader.vertices.size())\n",
    "\n",
    "# we can also load the metric field used to generate the grid and the cell area (or volume)\n",
    "print(dataloader.metric.size(), dataloader.weights.size())"
   ],
   "id": "c8c0cb2cc5751f5e",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Available fields: ['p']\n",
      "Grid size: torch.Size([3734, 2])\n",
      "torch.Size([3734]) torch.Size([3734])\n"
     ]
    }
   ],
   "execution_count": 3
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "We can access, e.g., the available write times, the field names, the metric field and the cell areas or volumes (`weights`) directly via the properties of the `dataloader` instance. We can also load snapshots using the `load_snapshot()` method for further post-processing. In the following step, we compute an SVD of the velocity field as an illustrative example:",
   "id": "adefa3bc37646a9c"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-02-23T07:54:54.841768352Z",
     "start_time": "2026-02-23T07:54:54.600704501Z"
    }
   },
   "cell_type": "code",
   "source": [
    "# we want to load all available snapshots in the interval 4 s <= t < 5 s\n",
    "t_start, t_end = 4, 5\n",
    "write_times = sorted([t for t in dataloader.write_times if (t_start <= float(t) < t_end)], key=float)\n",
    "\n",
    "# load the velocity field for the specified write times\n",
    "field = dataloader.load_snapshot(\"U\", write_times)\n",
    "\n",
    "# perform an SVD weighted with cell areas (dataloader.weights) as an example computation\n",
    "s, U, V = compute_svd(field, dataloader.weights)\n",
    "print(field.size(), s.shape, U.shape, V.shape)"
   ],
   "id": "f0b1c7f8fe3d9d8e",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([3734, 2, 301]) torch.Size([124]) torch.Size([3734, 2, 124]) torch.Size([301, 124])\n"
     ]
    }
   ],
   "execution_count": 4
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "You can find the full documentation of the `Dataloader` in the [Python API](https://sparsespatialsampling.readthedocs.io/en/latest/sparseSpatialSampling.data.html).\n",
    "\n",
    "## 2. Using the $\\mathbf{S^3}$ `Datawriter` to write arbitrary data to HDF5\n",
    "In the previous tutorials, we used the `export()` method of the `ExportData` class to interpolate and export time-dependent field data to HDF5. However, the results of our SVD (or of any other post-processing) are not time-dependent. We could just dump them into a pickle file or similar, but then we wouldn't be able to visualize the POD modes, for example in `ParaView`. To solve this issue, we can use the $S^3$ `Datawriter` class to write any data into an HDF5 file.\n",
    "\n",
    "The `write_data()` method of the `Datawriter` class can write data to three groups:\n",
    "- `group=\"grid\"` writes the $S^3$ grid stored in the `Dataloader`\n",
    "- `group=\"constant\"` writes constant data (static in time)\n",
    "- `group=\"data\"` writes time-dependent data\n",
    "\n",
    "To visualize data later, we have to write a grid. We can do this either manually or with the `write_grid()` method for convenience, as shown below. Next, we write our results from the SVD. Since this data is static in time, we use the group `constant`. If we also write time-dependent data, the constant data is written into the first time step. Lastly, we write five snapshots of the velocity field to the group `data`, since the velocity field is time-dependent.\n",
    "\n",
    "**Note 1:** To visualize data, e.g., in ParaView, the fields written to the HDF5 file must have the dimensions of the grid generated by $S^3$. Data with any other dimensions cannot be visualized in ParaView.\n",
    "\n",
    "**Note 2:** The `Datawriter` does not perform any kind of interpolation of data onto the $S^3$ grid (in contrast to the `ExportData` class)."
   ],
   "id": "7f29880ac381d304"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-02-23T07:54:54.960561541Z",
     "start_time": "2026-02-23T07:54:54.844759891Z"
    }
   },
   "cell_type": "code",
   "source": [
    "# write the data to HDF5 & XDMF\n",
    "datawriter = Datawriter(save_path, f\"{file_name}_svd.h5\")\n",
    "\n",
    "# write the grid, for convenience we use the write_grid() method:\n",
    "datawriter.write_grid(dataloader)\n",
    "\n",
    "# we could also write the grid as:\n",
    "# datawriter.n_cells = dataloader.vertices.shape[0]\n",
    "# datawriter.write_data(\"centers\", group=\"grid\", data=dataloader.vertices)\n",
    "# datawriter.write_data(\"vertices\", group=\"grid\", data=dataloader.nodes)\n",
    "# datawriter.write_data(\"faces\", group=\"grid\", data=dataloader.faces)\n",
    "\n",
    "# set the max. number of modes to write, here we want to write all available modes\n",
    "n_modes = U.size(-1)\n",
    "\n",
    "# write the modes as vectors, where each mode is treated as an independent vector.\n",
    "# Since they have the dimensions of the grid, we can visualize them in ParaView.\n",
    "# However, they are not time-dependent, so we write them to the group constant\n",
    "for i in range(n_modes):\n",
    "    if len(U.size()) == 2:\n",
    "        datawriter.write_data(f\"mode_{i + 1}\", group=\"constant\", data=U[:, i].squeeze())\n",
    "    else:\n",
    "        datawriter.write_data(f\"mode_{i + 1}\", group=\"constant\", data=U[:, :, i].squeeze())\n",
    "\n",
    "# write the remaining data. Data whose size doesn't match the dimensions of the grid (here V and s) is not referenced in the XDMF file\n",
    "datawriter.write_data(\"V\", group=\"constant\", data=V)\n",
    "datawriter.write_data(\"s\", group=\"constant\", data=s)\n",
    "datawriter.write_data(\"cell_area\", group=\"constant\", data=dataloader.weights)\n",
    "\n",
    "# we can also write time-dependent data, for example, the first 5 snapshots of the velocity field.\n",
    "# To do so, we have to write it to the group data. We can only write a single time step per call,\n",
    "# hence the export() method is recommended for exporting temporal data\n",
    "for i, t in enumerate(write_times[:5]):\n",
    "    print(f\"Writing time step t = {t} s.\")\n",
    "    datawriter.write_data(\"U\", group=\"data\", data=field[:, :, i], time_step=t)\n",
    "\n",
    "# it is important to close the datawriter. Otherwise, we can't execute this cell multiple times, since Jupyter keeps the file handle open\n",
    "datawriter.close()\n",
    "\n",
    "# write XDMF file, so we can open it up in ParaView\n",
    "datawriter.write_xdmf_file()"
   ],
   "id": "90286bc4a4ec4884",
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[2026-02-23 08:54:54] INFO     Writing XDMF file for file cylinder2D_metric_0.75_svd.h5\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Writing time step t = 4.000 s.\n",
      "Writing time step t = 4.001 s.\n",
      "Writing time step t = 4.002 s.\n",
      "Writing time step t = 4.003 s.\n",
      "Writing time step t = 4.004 s.\n"
     ]
    }
   ],
   "execution_count": 5
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "We can now open up the `XDMF` file in `ParaView` and visualize the field data. This concludes tutorial 5.",
   "id": "99d0d588b9997b86"
  },
  {
   "metadata": {},
   "cell_type": "code",
   "outputs": [],
   "execution_count": null,
   "source": "",
   "id": "4c3227ac830940ce"
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}