diff --git a/README.rst b/README.rst
index 97d607da..2339e482 100644
--- a/README.rst
+++ b/README.rst
@@ -1,70 +1,20 @@
 =============
 LenslessPiCam
 =============
+Ahmed Elalamy (324610), Seif Hamed (312081), Ghita Tagemouati (330383)
 
-.. image:: https://readthedocs.org/projects/lensless/badge/?version=latest
-    :target: http://lensless.readthedocs.io/en/latest/
-    :alt: Documentation Status
-
-
-.. image:: https://joss.theoj.org/papers/10.21105/joss.04747/status.svg
-      :target: https://doi.org/10.21105/joss.04747
-      :alt: DOI
-
-.. image:: https://static.pepy.tech/badge/lensless
-      :target: https://www.pepy.tech/projects/lensless
-      :alt: Downloads
-
-
-*A Hardware and Software Toolkit for Lensless Computational Imaging with a Raspberry Pi*
------------------------------------------------------------------------------------------
-
-.. image:: https://github.com/LCAV/LenslessPiCam/raw/main/scripts/recon/example.png
-    :alt: Lensless imaging example
-    :align: center
-
-
-This toolkit has everything you need to perform imaging with a lensless
-camera. We make use of a low-cost implementation of DiffuserCam [1]_, 
-where we use a piece of tape instead of the lens and the
-`Raspberry Pi HQ camera sensor <https://www.raspberrypi.com/products/raspberry-pi-high-quality-camera>`__
-(the `V2 sensor <https://www.raspberrypi.com/products/camera-module-v2/>`__
-is also supported). Similar principles and methods can be used for a
-different lensless encoder and a different sensor. 
-
-*If you are interested in exploring reconstruction algorithms without building the camera, that is entirely possible!*
-The provided reconstruction algorithms can be used with the provided data or simulated data.
-
-We've also written a few Medium articles to guide users through the process
-of building the camera, measuring data with it, and reconstruction.
-They are all laid out in `this post <https://medium.com/@bezzam/a-complete-lensless-imaging-tutorial-hardware-software-and-algorithms-8873fa81a660>`__.
+Our work is mainly done in the trainable_mask.py and mask.py files.
 
 Setup 
 -----
 
-If you are just interested in using the reconstruction algorithms and 
-plotting / evaluation tools you can install the package via ``pip``:
+First, install the lensless package
 
 .. code:: bash
 
    pip install lensless
 
 
-For plotting, you may also need to install
-`Tk <https://stackoverflow.com/questions/5459444/tkinter-python-may-not-be-configured-for-tk>`__.
-
-
-For performing measurements, the expected workflow is to have a local 
-computer which interfaces remotely with a Raspberry Pi equipped with 
-the HQ camera sensor (or V2 sensor). Instructions on building the camera
-can be found `here <https://lensless.readthedocs.io/en/latest/building.html>`__.
-
-The software from this repository has to be installed on **both** your
-local machine and the Raspberry Pi. Note that we highly recommend using
-Python 3.9, as some Python library versions may not be available with 
-earlier versions of Python. Moreover, its `end-of-life <https://endoflife.date/python>`__ 
-is Oct 2025.
-
 *Local machine setup*
 =====================
 
@@ -92,101 +42,8 @@ install the library locally.
 
    # extra dependencies for local machine for plotting/reconstruction
    pip install -r recon_requirements.txt
+   pip install -r mask_requirements.txt
 
-   # (optional) try reconstruction on local machine
-   python scripts/recon/admm.py
-
-   # (optional) try reconstruction on local machine with GPU
-   python scripts/recon/admm.py -cn pytorch
-
-
-Note (25-04-2023): for using the :py:class:`~lensless.recon.apgd.APGD` reconstruction method based on Pycsou
-(now `Pyxu <https://github.com/matthieumeo/pyxu>`__), a specific commit has 
-to be installed (as there was no release at the time of implementation):
-
-.. code:: bash
-
-   pip install git+https://github.com/matthieumeo/pycsou.git@38e9929c29509d350a7ff12c514e2880fdc99d6e
-
-If PyTorch is installed, you will need to be sure to have PyTorch 2.0 or higher, 
-as Pycsou is not compatible with earlier versions of PyTorch. Moreover, 
-Pycsou requires Python within 
-`[3.9, 3.11) <https://github.com/matthieumeo/pycsou/blob/v2-dev/setup.cfg#L28>`__.
-
-Moreover, ``numba`` (requirement for Pycsou V2) may require an older version of NumPy:
-
-.. code:: bash
-
-   pip install numpy==1.23.5
-
-*Raspberry Pi setup*
-====================
-
-After `flashing your Raspberry Pi with SSH enabled <https://medium.com/@bezzam/setting-up-a-raspberry-pi-without-a-monitor-headless-9a3c2337f329>`__, 
-you need to set it up for `passwordless access <https://medium.com/@bezzam/headless-and-passwordless-interfacing-with-a-raspberry-pi-ssh-453dd75154c3>`__. 
-Do not set a password for your SSH key pair, as this will not work with the
-provided scripts.
-
-On the Raspberry Pi, you can then run the following commands (from the ``home`` 
-directory):
-
-.. code:: bash
-
-   # dependencies
-   sudo apt-get install -y libimage-exiftool-perl libatlas-base-dev \
-   python3-numpy python3-scipy python3-opencv
-   sudo pip3 install -U virtualenv
-
-   # download from GitHub
-   git clone git@github.com:LCAV/LenslessPiCam.git
-
-   # install in virtual environment
-   cd LenslessPiCam
-   virtualenv --system-site-packages -p python3 lensless_env
-   source lensless_env/bin/activate
-   pip install --no-deps -e .
-   pip install -r rpi_requirements.txt
-   
-
-Acknowledgements
-----------------
-
-The idea of building a lensless camera from a Raspberry Pi and a piece of 
-tape comes from Prof. Laura Waller's group at UC Berkeley. So a huge kudos 
-to them for the idea and making tools/code/data available! Below is some of 
-the work that has inspired this toolkit:
-
-* `Build your own DiffuserCam tutorial <https://waller-lab.github.io/DiffuserCam/tutorial>`__.
-* `DiffuserCam Lensless MIR Flickr dataset <https://waller-lab.github.io/LenslessLearning/dataset.html>`__ [2]_. 
-
-A few students at EPFL have also contributed to this project:
-
-* Julien Sahli: support and extension of algorithms for 3D.
-* Yohann Perron: unrolled algorithms for reconstruction.
-
-Citing this work
-----------------
-
-If you use these tools in your own research, please cite the following:
-
-::
-
-   @article{Bezzam2023,
-      doi = {10.21105/joss.04747},
-      url = {https://doi.org/10.21105/joss.04747},
-      year = {2023},
-      publisher = {The Open Journal},
-      volume = {8},
-      number = {86},
-      pages = {4747},
-      author = {Eric Bezzam and Sepand Kashani and Martin Vetterli and Matthieu Simeoni},
-      title = {LenslessPiCam: A Hardware and Software Platform for Lensless Computational Imaging with a Raspberry Pi},
-      journal = {Journal of Open Source Software}
-   }
-
-References
-----------
-
-.. [1] Antipa, N., Kuo, G., Heckel, R., Mildenhall, B., Bostan, E., Ng, R., & Waller, L. (2018). DiffuserCam: lensless single-exposure 3D imaging. Optica, 5(1), 1-9.
+   # training with the height varying mask
+   python scripts/recon/train_unrolled.py -cn train_heightvarying
 
-.. [2] Monakhova, K., Yurtsever, J., Kuo, G., Antipa, N., Yanny, K., & Waller, L. (2019). Learned reconstructions for practical mask-based lensless imaging. Optics express, 27(20), 28075-28090.
diff --git a/configs/train_coded_aperture.yaml b/configs/train_coded_aperture.yaml
new file mode 100644
index 00000000..99936b0b
--- /dev/null
+++ b/configs/train_coded_aperture.yaml
@@ -0,0 +1,55 @@
+# python scripts/recon/train_unrolled.py -cn train_coded_aperture
+defaults:
+  - train_unrolledADMM
+  - _self_
+
+# Train Dataset
+files:
+  dataset: mnist   # Simulated : "mnist", "fashion_mnist", "cifar10", "CelebA". Measure :"DiffuserCam"
+  celeba_root: /scratch/bezzam
+  downsample: 16     # TODO use downsample simulation instead? 
+  n_files: 100
+  crop:
+    vertical: [810, 2240]
+    horizontal: [1310, 2750]
+
+torch_device: "cuda" 
+
+optimizer:
+  # type: Adam   # Adam, SGD...
+  # lr: 1e-4
+  type: SGD
+  lr: 0.01
+
+#Trainable Mask
+trainable_mask:
+  mask_type: TrainableCodedAperture
+  # optimizer: Adam
+  # mask_lr: 1e-3
+  optimizer: SGD
+  mask_lr: 0.01
+  L1_strength: False
+  binary: False
+  initial_value:
+    psf_wavelength: [550e-9]
+    method: MLS
+    n_bits: 8  # (2**n_bits-1, 2**n_bits-1)
+    # method: MURA
+    # n_bits: 25  # (4*nbits*1, 4*nbits*1)
+    # # -- applicable for phase masks
+    # design_wv: 550e-9   
+
+simulation:
+  grayscale: True
+  flip: False
+  scene2mask: 40e-2
+  mask2sensor: 2e-3
+  sensor: "rpi_hq"
+  object_height: 0.30
+
+training:
+  crop_preloss: True # crop region for computing loss
+  batch_size: 4
+  epoch: 25
+  eval_batch_size: 16
+  save_every: 1
diff --git a/configs/train_heightvarying.yaml b/configs/train_heightvarying.yaml
new file mode 100644
index 00000000..dfdc64f1
--- /dev/null
+++ b/configs/train_heightvarying.yaml
@@ -0,0 +1,43 @@
+# python scripts/recon/train_unrolled.py -cn train_multilens_array
+defaults:
+  - train_unrolledADMM
+  - _self_
+
+# Train Dataset
+files:
+  dataset: fashion_mnist   # Simulated : "mnist", "fashion_mnist", "cifar10", "CelebA". Measure :"DiffuserCam"
+  celeba_root: /scratch/bezzam
+  downsample: 16     # TODO use simulation instead?   
+  n_files: 100
+  crop:
+    vertical: [810, 2240]
+    horizontal: [1310, 2750]
+
+torch_device: "cuda:0" 
+
+#Trainable Mask
+trainable_mask:
+  mask_type: TrainableHeightVarying
+  optimizer: Adam
+  mask_lr: 1e-3
+  L1_strength: False
+  binary: False
+  initial_value: 
+    psf_wavelength: [550e-9]
+    design_wv: 550e-9
+
+simulation:
+  grayscale: True
+  flip: False
+  scene2mask: 40e-2
+  mask2sensor: 2e-3
+  sensor: "rpi_hq"
+  downsample: 16
+  object_height: 0.30
+
+training:
+  crop_preloss: True # crop region for computing loss
+  batch_size: 2
+  epoch: 25
+  eval_batch_size: 16
+  save_every: 1
diff --git a/configs/train_multilens_array.yaml b/configs/train_multilens_array.yaml
new file mode 100644
index 00000000..cbfbf48c
--- /dev/null
+++ b/configs/train_multilens_array.yaml
@@ -0,0 +1,44 @@
+# python scripts/recon/train_unrolled.py -cn train_multilens_array
+defaults:
+  - train_unrolledADMM
+  - _self_
+
+# Train Dataset
+files:
+  dataset: fashion_mnist   # Simulated : "mnist", "fashion_mnist", "cifar10", "CelebA". Measure :"DiffuserCam"
+  celeba_root: /scratch/bezzam
+  downsample: 16     # TODO use simulation instead?   
+  n_files: 100
+  crop:
+    vertical: [810, 2240]
+    horizontal: [1310, 2750]
+
+torch_device: "cpu" 
+
+#Trainable Mask
+trainable_mask:
+  mask_type: TrainableMultiLensArray
+  optimizer: Adam
+  mask_lr: 1e-3
+  L1_strength: False
+  binary: False
+  initial_value:
+    N : 10 #TODO: check this value ? 
+    psf_wavelength: [550e-9]
+    design_wv: 550e-9
+
+simulation:
+  grayscale: True
+  flip: False
+  scene2mask: 40e-2
+  mask2sensor: 2e-3
+  sensor: "rpi_hq"
+  downsample: 16
+  object_height: 0.30
+
+training:
+  crop_preloss: True # crop region for computing loss
+  batch_size: 2
+  epoch: 25
+  eval_batch_size: 16
+  save_every: 1
diff --git a/configs/train_unrolledADMM.yaml b/configs/train_unrolledADMM.yaml
index f7602f01..918f2eec 100644
--- a/configs/train_unrolledADMM.yaml
+++ b/configs/train_unrolledADMM.yaml
@@ -84,6 +84,7 @@ trainable_mask:
   initial_value: psf
   grayscale: False
   mask_lr: 1e-3
+  optimizer: Adam  # Adam, SGD... (Pytorch class)
   L1_strength: 1.0  #False or float
 
 target: "object_plane"    # "original" or "object_plane" or "label"
@@ -129,7 +130,7 @@ training:
   crop_preloss: True # crop region for computing loss
 
 optimizer:
-  type: Adam
+  type: Adam  # Adam, SGD... (Pytorch class)
   lr: 1e-4
   slow_start: False  #float how much to reduce lr for first epoch
   # Decay LR in step fashion: https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.StepLR.html
diff --git a/data/psf.tiff b/data/psf.tiff
new file mode 100644
index 00000000..0be2fde3
Binary files /dev/null and b/data/psf.tiff differ
diff --git a/lensless/eval/benchmark.py b/lensless/eval/benchmark.py
index 8abd254e..61abe77f 100644
--- a/lensless/eval/benchmark.py
+++ b/lensless/eval/benchmark.py
@@ -124,8 +124,8 @@ def benchmark(
 
             for i, idx in enumerate(batch_idx):
                 if idx in save_idx:
-                    prediction_np = prediction.cpu().numpy()[i].squeeze()
-                    # switch to [H, W, C]
+                    prediction_np = prediction.cpu().numpy()[i]
+                    # switch to [H, W, C] for saving
                     prediction_np = np.moveaxis(prediction_np, 0, -1)
                     save_image(prediction_np, fp=os.path.join(output_dir, f"{idx}.png"))
 
diff --git a/lensless/hardware/mask.py b/lensless/hardware/mask.py
index f9597bf5..2c21d98c 100644
--- a/lensless/hardware/mask.py
+++ b/lensless/hardware/mask.py
@@ -32,6 +32,7 @@
 from waveprop.noise import add_shot_noise
 from lensless.hardware.sensor import VirtualSensor
 from lensless.utils.image import resize
+from matplotlib import pyplot as plt
 
 try:
     import torch
@@ -53,6 +54,8 @@ def __init__(
         size=None,
         feature_size=None,
         psf_wavelength=[460e-9, 550e-9, 640e-9],
+        is_torch=False,
+        torch_device="cpu",
         **kwargs
     ):
         """
@@ -71,7 +74,6 @@ def __init__(
         psf_wavelength: list, optional
             List of wavelengths to simulate PSF (m). Default is [460e-9, 550e-9, 640e-9] nm (blue, green, red).
         """
-
         resolution = np.array(resolution)
         assert len(resolution) == 2, "Sensor resolution should be of length 2"
 
@@ -94,8 +96,8 @@ def __init__(
             assert np.all(feature_size > 0), "Feature size should be positive"
         assert np.all(resolution * feature_size <= size)
 
-        self.phase_mask = None
         self.resolution = resolution
+        self.resolution = (int(self.resolution[0]), int(self.resolution[1]))
         self.size = size
         if feature_size is None:
             self.feature_size = self.size / self.resolution
@@ -103,18 +105,21 @@ def __init__(
             self.feature_size = feature_size
         self.distance_sensor = distance_sensor
 
+        self.is_torch = is_torch
+        self.torch_device = torch_device
+
         # create mask
-        self.mask = None
         self.create_mask()
         self.shape = self.mask.shape
 
         # PSF
+        assert hasattr(psf_wavelength, "__len__"), "psf_wavelength should be a list"
         self.psf_wavelength = psf_wavelength
         self.psf = None
         self.compute_psf()
 
     @classmethod
-    def from_sensor(cls, sensor_name, downsample=None, **kwargs):
+    def from_sensor(cls, sensor_name, downsample=None,**kwargs):
         """
         Constructor from an existing virtual sensor that copies over the sensor parameters
         (sensor resolution, sensor size, feature size).
@@ -156,20 +161,33 @@ def compute_psf(self):
         Compute the intensity PSF with bandlimited angular spectrum (BLAS) for each wavelength.
         Common to all types of masks.
         """
-        psf = np.zeros(tuple(self.resolution) + (len(self.psf_wavelength),), dtype=np.complex64)
+        if self.is_torch:
+            psf = torch.zeros(
+                tuple(self.resolution) + (len(self.psf_wavelength),),
+                dtype=torch.complex64,
+                device=self.torch_device,
+            )
+        else:
+            psf = np.zeros(tuple(self.resolution) + (len(self.psf_wavelength),), dtype=np.complex64)
         for i, wv in enumerate(self.psf_wavelength):
             psf[:, :, i] = angular_spectrum(
                 u_in=self.mask,
                 wv=wv,
                 d1=self.feature_size,
                 dz=self.distance_sensor,
-                dtype=np.float32,
+                dtype=np.float32 if not self.is_torch else torch.float32,
                 bandlimit=True,
+                device=self.torch_device if self.is_torch else None,
             )[0]
 
         # intensity PSF
-        self.psf = np.abs(psf) ** 2
-
+        if self.is_torch:
+            self.psf = torch.abs(psf) ** 2
+            self.psf.to(self.torch_device)
+        else:
+            self.psf = np.abs(psf) ** 2
+            
+        
 
 class CodedAperture(Mask):
     """
@@ -197,33 +215,69 @@ def __init__(self, method="MLS", n_bits=8, **kwargs):
         self.method = method
         self.n_bits = n_bits
 
+        assert self.method.upper() in ["MURA", "MLS"], "Method should be either 'MLS' or 'MURA'"
+        # TODO? use: https://github.com/bpops/codedapertures
+
+        # initialize parameters
+        if self.method.upper() == "MURA":
+            self.mask = self.squarepattern(4 * self.n_bits + 1)
+            self.row = None
+            self.col = None
+        else:
+            seq = max_len_seq(self.n_bits)[0]
+            self.row = seq
+            self.col = seq
+
+        if "is_torch" in kwargs and kwargs["is_torch"]:
+            torch_device = kwargs["torch_device"] if "torch_device" in kwargs else "cpu"
+            if self.row is not None and self.col is not None:
+                self.row = torch.from_numpy(self.row).float().to(torch_device)
+                self.col = torch.from_numpy(self.col).float().to(torch_device)
+            else:
+                self.mask = torch.from_numpy(self.mask).float().to(torch_device)
+
+        # needs to be done at the end as it calls create_mask
         super().__init__(**kwargs)
 
-    def create_mask(self):
+    def create_mask(self, row=None, col=None, mask=None):
         """
-        Creating coded aperture mask using either the MURA of MLS method.
+        Creating coded aperture mask.
         """
-        assert self.method.upper() in ["MURA", "MLS"], "Method should be either 'MLS' or 'MURA'"
 
-        # Generating pattern
-        if self.method.upper() == "MURA":
-            self.mask = self.squarepattern(4 * self.n_bits + 1)[1:, 1:]
-            self.row = 2 * self.mask[0, :] - 1
-            self.col = 2 * self.mask[:, 0] - 1
+        if mask is not None:
+            raise NotImplementedError("Mask loading not implemented yet.")
+
+        # if row and col are provided, use them
+        if row is None and col is None:
+            row = self.row
+            col = self.col
+
+        # outer product
+        if row is not None and col is not None:
+            if self.is_torch:
+                self.mask = torch.outer(row, col)
+            else:
+                self.mask = np.outer(row, col)
         else:
-            seq = max_len_seq(self.n_bits)[0] * 2 - 1
-            h_r = np.r_[seq, seq]
-            self.row = h_r
-            self.col = h_r
-            self.mask = (np.outer(h_r, h_r) + 1) / 2
+            assert self.mask is not None
 
-        # Upscaling
+        # resize to sensor shape
         if np.any(self.resolution != self.mask.shape):
-            upscaled_mask = resize(
-                self.mask[:, :, np.newaxis], shape=tuple(self.resolution) + (1,)
-            ).squeeze()
-            upscaled_mask = np.clip(upscaled_mask, 0, 1)
-            self.mask = np.round(upscaled_mask).astype(int)
+
+            if self.is_torch:
+                self.mask = self.mask.unsqueeze(0).unsqueeze(0)
+                self.mask = torch.nn.functional.interpolate(
+                    self.mask, size=tuple(self.resolution), mode="nearest"
+                ).squeeze()
+            else:
+                # self.mask = resize(self.mask[:, :, np.newaxis], shape=tuple(self.resolution) + (1,))
+                self.mask = resize(
+                    self.mask[:, :, np.newaxis],
+                    shape=tuple(self.resolution) + (1,),
+                    interpolation=cv.INTER_NEAREST,
+                ).squeeze()
+
+        # assert np.all(np.unique(self.mask) == np.array([0, 1]))
 
     def is_prime(self, n):
         """
@@ -247,6 +301,7 @@ def squarepattern(self, p):
         p: int
             Number of bits.
         """
+
         if not self.is_prime(p):
             raise ValueError("p is not a valid length. It must be prime.")
         A = np.zeros((p, p), dtype=int)
@@ -321,6 +376,150 @@ def simulate(self, obj, snr_db=20):
 
         return meas
 
+class MultiLensArray(Mask):
+    """
+    Multi-lens array mask.
+    """
+    def __init__(
+        self, N = None, radius = None, loc = None, refractive_index = 1.2, design_wv=532e-9, seed = 0, min_height=1e-5, radius_range=(1e-5, 1e-3), **kwargs
+    ):
+        """
+        Multi-lens array mask constructor.
+
+        Parameters
+        ----------
+        N: int
+            Number of lenses
+        radius: array_like
+            Radius of the lenses (m)
+        loc: array_like of tuples
+            Location of the lenses (m)
+        refractive_index: float
+            Refractive index of the mask substrate. Default is 1.2.
+        wavelength: float
+        seed: int
+            Seed for the random number generator. Default is 0.
+        min_height: float
+            Minimum height of the lenses (m). Default is 1e-3.
+        """
+        self.N = N
+        self.radius = radius
+        self.loc = loc
+        self.refractive_index = refractive_index
+        self.wavelength = design_wv
+        self.seed = seed
+        self.min_height = min_height
+        self.radius_range = radius_range
+
+        self.torch_device = kwargs["torch_device"] if "torch_device" in kwargs else "cpu"
+        self.is_torch = kwargs["is_torch"] if "is_torch" in kwargs else False
+        self.size = kwargs["size"] if "size" in kwargs else None
+        
+        super().__init__(**kwargs)
+    
+    def check_asserts(self):
+        assert self.radius_range[0] < self.radius_range[1], "Minimum radius should be smaller than maximum radius"
+        if self.radius is not None:
+            if self.is_torch:
+                assert torch.all(self.radius >= 0)
+            else:
+                assert np.all(self.radius >= 0)
+            assert self.loc is not None, "Location of the lenses should be specified if their radius is specified"
+            assert len(self.radius) == len(self.loc), "Number of radius should be equal to the number of locations"
+            #self.radius = torch.clamp(self.radius, min=self.radius_range[0], max=self.radius_range[1]).to(self.torch_device) if self.is_torch else np.clip(self.radius, self.radius_range[0], self.radius_range[1]) 
+            self.N = len(self.radius)
+            circles = np.array([(self.loc[i][0], self.loc[i][1], self.radius[i]) for i in range(self.N)]) if not self.is_torch else torch.tensor([(self.loc[i][0], self.loc[i][1], self.radius[i]) for i in range(self.N)]).to(self.torch_device)
+            assert self.no_circle_overlap(circles), "lenses should not overlap"
+        else:
+            assert self.N is not None, "If positions are not specified, the number of lenses should be specified"
+            if self.is_torch:
+                torch.manual_seed(self.seed)
+                radius = torch.rand(self.N).to(self.torch_device) * (self.radius_range[1] - self.radius_range[0]) + self.radius_range[0]
+                self.radius = torch.sort(radius, descending=True)[0].to(self.torch_device)
+                self.loc, _ = self.place_spheres_on_plane(self.size[0], self.size[1], self.radius)
+            else:
+                np.random.seed(self.seed)
+                self.radius = np.random.uniform(self.radius_range[0], self.radius_range[1], self.N)
+            assert self.N == len(self.radius)
+
+    def no_circle_overlap(self, circles):
+        """Check if any circle in the list overlaps with another."""
+        for i in range(len(circles)):
+            if self.does_circle_overlap(circles[i+1:], circles[i][0], circles[i][1], circles[i][2]):
+                return False
+        return True
+    
+    def does_circle_overlap(self, circles, x, y, r):
+        """Check if a circle overlaps with any in the list."""
+        if not self.is_torch:
+            for (cx, cy, cr) in circles:
+                if np.sqrt((x - cx)**2 + (y - cy)**2) <= r + cr:
+                    return True, (cx, cy, cr)
+            return False
+        else:
+            for (cx, cy, cr) in circles:
+                if torch.sqrt((x - cx)**2 + (y - cy)**2) <= r + cr:
+                    return True, (cx, cy, cr)
+            return False            
+
+
+    def place_spheres_on_plane(self, width, height, radius, max_attempts=1000):
+        """Try to place circles on a 2D plane."""
+        placed_circles = []
+
+        for r in radius:
+            placed = False
+            for _ in range(max_attempts):
+                x = np.random.uniform(r, width - r) if self.is_torch == False else torch.rand(1).to(self.torch_device) * (width - 2*r) + r
+                y = np.random.uniform(r, height - r) if self.is_torch == False else torch.rand(1).to(self.torch_device) * (height - 2*r) + r
+            
+                if not self.does_circle_overlap(placed_circles, x , y , r):
+                    placed_circles.append((x, y, r))
+                    placed = True
+                    print(f"Placed circle with rad {r}, and center ({x}, {y})")
+                    break
+        
+            if not placed:
+                print(f"Failed to place circle with rad {r}")
+                continue
+
+        placed_circles = np.array(placed_circles) if not self.is_torch else torch.tensor(placed_circles).to(self.torch_device)
+
+        circles = placed_circles[:, :2].to(self.torch_device)
+        radius = placed_circles[:, 2].to(self.torch_device)
+        return circles, radius
+
+    def create_mask(self, radius = None):
+        if radius is not None:
+            self.radius = radius
+        self.check_asserts()                
+        locs_res = self.loc.to(self.torch_device) * (1/self.feature_size[0])
+        radius_res = self.radius.to(self.torch_device) * (1/self.feature_size[0]) 
+        height = self.create_height_map(radius_res, locs_res).to(self.torch_device)
+
+        self.phi = (height * (self.refractive_index - 1) * 2 * np.pi / self.wavelength) if not self.is_torch else (height * (self.refractive_index - 1) * 2 * torch.pi / self.wavelength).to(self.torch_device)
+
+        self.mask = np.exp(1j * self.phi) if not self.is_torch else torch.exp(1j * self.phi).to(self.torch_device)
+
+    def create_height_map(self, radius, locs):
+        height = np.full((self.resolution[0], self.resolution[1]), self.min_height) if not self.is_torch else torch.full((self.resolution[0], self.resolution[1]), self.min_height).to(self.torch_device)
+
+        x = np.arange(self.resolution[0]) if not self.is_torch else torch.arange(self.resolution[0]).to(self.torch_device)
+        y = np.arange(self.resolution[1]) if not self.is_torch else torch.arange(self.resolution[1]).to(self.torch_device)
+        X, Y = np.meshgrid(x, y) if not self.is_torch else torch.meshgrid(x, y)
+        if self.is_torch:
+            X = X.to(self.torch_device)
+            Y = Y.to(self.torch_device)
+        for idx, rad in enumerate(radius):
+            contribution = self.lens_contribution(X, Y, rad, locs[idx]).to(self.torch_device) * self.feature_size[0]
+            contribution[(X - locs[idx][1])**2 + (Y - locs[idx][0])**2 > rad**2] = 0
+            height = height + contribution
+        assert np.all(height >= self.min_height) if not self.is_torch else torch.all(torch.ge(height, self.min_height))
+        return height
+    
+    def lens_contribution(self, x, y, radius, loc):
+        return np.sqrt(radius**2 - (x - loc[1])**2 - (y - loc[0])**2) if not self.is_torch else torch.sqrt(radius**2 - (x - loc[1])**2 - (y - loc[0])**2)
+
 
 class PhaseContour(Mask):
     """
@@ -361,33 +560,33 @@ def create_mask(self):
         """
         Creating phase contour from edges of Perlin noise.
         """
-
-        # Creating Perlin noise
-        proper_dim_1 = (self.resolution[0] // self.noise_period[0]) * self.noise_period[0]
-        proper_dim_2 = (self.resolution[1] // self.noise_period[1]) * self.noise_period[1]
-        noise = generate_perlin_noise_2d((proper_dim_1, proper_dim_2), self.noise_period)
-
-        # Upscaling to correspond to sensor size
-        if np.any(self.resolution != noise.shape):
-            noise = resize(noise[:, :, np.newaxis], shape=tuple(self.resolution) + (1,)).squeeze()
-
-        # Edge detection
-        binary = np.clip(np.round(np.interp(noise, (-1, 1), (0, 1))), a_min=0, a_max=1)
-        self.target_psf = cv.Canny(np.interp(binary, (-1, 1), (0, 255)).astype(np.uint8), 0, 255)
-
-        # Computing mask and height map
-        phase_mask, height_map = phase_retrieval(
-            target_psf=self.target_psf,
-            wv=self.design_wv,
-            d1=self.feature_size,
-            dz=self.distance_sensor,
-            n=self.refractive_index,
-            n_iter=self.n_iter,
-            height_map=True,
-        )
-        self.height_map = height_map
-        self.phase_pattern = phase_mask
-        self.mask = np.exp(1j * phase_mask)
+        if not (torch_available and isinstance(self.mask, torch.Tensor)):
+            # Creating Perlin noise
+            proper_dim_1 = (self.resolution[0] // self.noise_period[0]) * self.noise_period[0]
+            proper_dim_2 = (self.resolution[1] // self.noise_period[1]) * self.noise_period[1]
+            noise = generate_perlin_noise_2d((proper_dim_1, proper_dim_2), self.noise_period)
+
+            # Upscaling to correspond to sensor size
+            if np.any(self.resolution != noise.shape):
+                noise = resize(noise[:, :, np.newaxis], shape=tuple(self.resolution) + (1,)).squeeze()
+
+            # Edge detection
+            binary = np.clip(np.round(np.interp(noise, (-1, 1), (0, 1))), a_min=0, a_max=1)
+            self.target_psf = cv.Canny(np.interp(binary, (-1, 1), (0, 255)).astype(np.uint8), 0, 255)
+
+            # Computing mask and height map
+            phase_mask, height_map = phase_retrieval(
+                target_psf=self.target_psf,
+                wv=self.design_wv,
+                d1=self.feature_size,
+                dz=self.distance_sensor,
+                n=self.refractive_index,
+                n_iter=self.n_iter,
+                height_map=True,
+            )
+            self.height_map = height_map
+            self.phase_pattern = phase_mask
+            self.mask = np.exp(1j * phase_mask)
 
 
 def phase_retrieval(target_psf, wv, d1, dz, n=1.2, n_iter=10, height_map=False):
@@ -401,7 +600,7 @@ def phase_retrieval(target_psf, wv, d1, dz, n=1.2, n_iter=10, height_map=False):
         Target PSF to optimize the phase mask for.
     wv: float
         Wavelength (m).
-    d1: float
+    d1: float=
         Sample period on the sensor i.e. pixel size (m).
     dz: float
         Propagation distance between the mask and the sensor.
@@ -410,8 +609,10 @@ def phase_retrieval(target_psf, wv, d1, dz, n=1.2, n_iter=10, height_map=False):
     n_iter: int
         Number of iterations. Default value is 10.
     """
+
     M_p = np.sqrt(target_psf)
 
+
     if hasattr(d1, "__len__"):
         if d1[0] != d1[1]:
             warnings.warn("Non-square pixel, first dimension taken as feature size.")
@@ -419,18 +620,18 @@ def phase_retrieval(target_psf, wv, d1, dz, n=1.2, n_iter=10, height_map=False):
 
     for _ in range(n_iter):
         # back propagate from sensor to mask
-        M_phi = fresnel_conv(M_p, wv, d1, -dz, dtype=np.float32)[0]
+        M_phi = fresnel_conv(M_p, wv, d1, -dz, dtype=torch.float32)[0]
         # constrain amplitude at mask to be unity, i.e. phase pattern
-        M_phi = np.exp(1j * np.angle(M_phi))
+        M_phi = torch.exp(1j * torch.angle(M_phi))
         # forward propagate from mask to sensor
-        M_p = fresnel_conv(M_phi, wv, d1, dz, dtype=np.float32)[0]
+        M_p = fresnel_conv(M_phi, wv, d1, dz, dtype=torch.float32)[0]
         # constrain amplitude to be sqrt(PSF)
-        M_p = np.sqrt(target_psf) * np.exp(1j * np.angle(M_p))
+        M_p = torch.sqrt(target_psf) * torch.exp(1j * torch.angle(M_p))
 
-    phi = (np.angle(M_phi) + 2 * np.pi) % (2 * np.pi)
+    phi = (torch.angle(M_phi) + 2 * torch.pi) % (2 * torch.pi)
 
     if height_map:
-        return phi, wv * phi / (2 * np.pi * (n - 1))
+        return phi, wv * phi / (2 * torch.pi * (n - 1))
     else:
         return phi
 
@@ -470,3 +671,92 @@ def create_mask(self):
         radius_px = self.radius / self.feature_size[0]
         mask = 0.5 * (1 + np.cos(np.pi * (x**2 + y**2) / radius_px**2))
         self.mask = np.round(mask)
+
+
+class HeightVarying(Mask):
+    """
+    A class representing a height-varying mask for lensless imaging.
+
+    Parameters
+    ----------
+    refractive_index : float, optional
+        The refractive index of the material. Default is 1.2.
+    wavelength : float, optional
+        The wavelength of the light. Default is 532e-9.
+    height_map : ndarray or None, optional
+        An array representing the height map of the mask. If None, a random height map is generated.
+    height_range : tuple, optional
+        A tuple (min, max) specifying the range of heights when generating a random height map.
+        Default is (min, max), where min and max are placeholders for the actual values.
+    seed : int, optional
+        Seed for the random number generator when generating a random height map. Default is 0.
+
+    Example
+    -------
+    Creating an instance with a custom height map:
+
+    >>> custom_height_map = np.array([0.1, 0.2, 0.3])
+    >>> height_varying_instance = HeightVarying(
+    ...     refractive_index=1.2,
+    ...     wavelength=532e-9,
+    ...     height_map=custom_height_map,
+    ...     height_range=(0.0, 1.0),
+    ...     seed=42
+    ... )
+    """
+    def __init__(
+            self, 
+
+            refractive_index = 1.2, 
+            design_wv = 532e-9, 
+            height_map = None,
+            height_range = (1e-5, 1e-3), 
+            seed = 0,
+            **kwargs):
+        
+
+        self.refractive_index = refractive_index
+        self.wavelength = design_wv
+        self.height_range = height_range
+        self.seed = seed
+
+
+        if height_map is not None:
+            self.height_map = height_map
+        else:
+            self.height_map = None
+            
+
+        super().__init__(**kwargs)
+
+    def get_phi(self):
+        if self.is_torch == False:
+            phi = self.height_map * (2*np.pi*(self.refractive_index-1) / self.wavelength)
+            #phi = phi % (2*np.pi)
+            return phi
+        else:
+            phi = self.height_map * (2 * torch.pi * (self.refractive_index - 1) / self.wavelength)
+            return phi
+        
+    def create_mask(self, height_map=None):
+        if height_map is not None:
+            self.height_map = height_map
+        if not self.is_torch:
+            if self.height_map is None:
+                np.random.seed(self.seed)
+                self.height_map = np.random.uniform(self.height_range[0], self.height_range[1], self.resolution)
+            assert self.height_map.shape == tuple(self.resolution)
+            phase_mask = self.get_phi()
+            self.mask = np.exp(1j * phase_mask)
+        
+        else:
+            if self.height_map is None:
+                torch.manual_seed(self.seed)
+                height_range_tensor = torch.tensor(self.height_range).to(self.torch_device)
+                # Generate a random height map using PyTorch
+                resolution = torch.tensor(self.resolution).to(self.torch_device)
+                self.height_map = torch.rand((resolution[0], resolution[1])).to(self.torch_device) * (height_range_tensor[1] - height_range_tensor[0]) + height_range_tensor[0]
+            assert self.height_map.shape == tuple(self.resolution)
+            phase_mask = self.get_phi()
+            self.mask = torch.exp(1j * phase_mask).to(self.torch_device)
+    
\ No newline at end of file
diff --git a/lensless/hardware/sensor.py b/lensless/hardware/sensor.py
index 0785204e..08d8fa46 100644
--- a/lensless/hardware/sensor.py
+++ b/lensless/hardware/sensor.py
@@ -213,6 +213,7 @@ def from_name(cls, name, downsample=None):
             Sensor.
 
         """
+
         if name not in SensorOptions.values():
             raise ValueError(f"Sensor {name} not supported.")
         sensor_specs = sensor_dict[name].copy()
diff --git a/lensless/hardware/trainable_mask.py b/lensless/hardware/trainable_mask.py
index f0d258ba..3023daa3 100644
--- a/lensless/hardware/trainable_mask.py
+++ b/lensless/hardware/trainable_mask.py
@@ -7,11 +7,16 @@
 # #############################################################################
 
 import abc
+import omegaconf
+import os
+import numpy as np
+from hydra.utils import get_original_cwd
 import torch
-from lensless.utils.image import is_grayscale
-from lensless.hardware.slm import get_programmable_mask, get_intensity_psf
+from lensless.utils.image import is_grayscale, rgb2gray
+from lensless.hardware.slm import full2subpattern, get_programmable_mask, get_intensity_psf
 from lensless.hardware.sensor import VirtualSensor
 from waveprop.devices import slm_dict
+from lensless.hardware.mask import CodedAperture, MultiLensArray, HeightVarying
 
 
 class TrainableMask(torch.nn.Module, metaclass=abc.ABCMeta):
@@ -25,25 +30,30 @@ class TrainableMask(torch.nn.Module, metaclass=abc.ABCMeta):
 
     """
 
-    def __init__(self, initial_mask, optimizer="Adam", lr=1e-3, **kwargs):
+    def __init__(self, optimizer="Adam", lr=1e-3, **kwargs):
         """
         Base constructor. Derived constructor may define new state variables
 
         Parameters
         ----------
-        initial_mask : :py:class:`~torch.Tensor`
-            Initial mask parameters.
         optimizer : str, optional
             Optimizer to use for updating the mask parameters, by default "Adam"
         lr : float, optional
             Learning rate for the mask parameters, by default 1e-3
         """
         super().__init__()
-        self._mask = torch.nn.Parameter(initial_mask)
-        self._optimizer = getattr(torch.optim, optimizer)([self._mask], lr=lr)
-        self.train_mask_vals = True
+        # self._param = [torch.nn.Parameter(p, requires_grad=True) for p in initial_param]
+        # # self._param = initial_param
+        # self._optimizer = getattr(torch.optim, optimizer)(self._param, lr=lr)
+        # self._counter = 0
+        self._optimizer = optimizer
+        self._lr = lr
         self._counter = 0
 
+    def _set_optimizer(self, param):
+        """Set the optimizer for the mask parameters."""
+        self._optimizer = getattr(torch.optim, self._optimizer)(param, lr=self._lr)
+
     @abc.abstractmethod
     def get_psf(self):
         """
@@ -63,17 +73,119 @@ def update_mask(self):
         self.project()
         self._counter += 1
 
-    def get_vals(self):
-        """Get the mask parameters."""
-        return self._mask
-
     @abc.abstractmethod
     def project(self):
         """Abstract method for projecting the mask parameters to a valid space (should be a subspace of [0,1])."""
         raise NotImplementedError
+    
+
+class TrainableMultiLensArray(TrainableMask):
 
+    def __init__(
+        self, sensor_name, downsample=None, optimizer="Adam", lr=1e-3, torch_device="cuda", **kwargs
+    ):
+
+        # 1) call base constructor so parameters can be set
+        super().__init__(optimizer, lr, **kwargs)
+        self.device = torch_device
+
+        # 2) initialize mask
+        assert "distance_sensor" in kwargs, "Distance to sensor must be specified"
+        assert "N" in kwargs, "Number of Lenses must be specified"
+        self._mask_obj = MultiLensArray.from_sensor(sensor_name, downsample, is_torch=True, torch_device=torch_device, **kwargs)
+        self._mask = self._mask_obj.mask
+
+        # 3) set learnable parameters (should be immediate attributes of the class)
+        self._radius = torch.nn.Parameter(self._mask_obj.radius)
+        initial_param = [self._radius]
+
+        # 4) set optimizer
+        self._set_optimizer(initial_param)
+        
+        # 5) compute PSF
+        self._psf = None
+        self.project()
+
+    def get_psf(self):
+        return self._psf
+
+    
+    def project(self):
+        with torch.no_grad():
+            # clamp back the radiuses
+            rad = torch.clamp(self._radius.data, self._mask_obj.radius_range[0], self._mask_obj.radius_range[1])
+
+            # sort in descending order
+            rad, idx = torch.sort(rad, descending=True)
+            loca = self._mask_obj.loc[idx]
+            self._mask_obj.loc = loca
+
+            circles = torch.cat((loca, rad.unsqueeze(-1)), dim=-1)
+            for idx, r in enumerate(rad):
+                min_loc = torch.min(loca[idx, 0], loca[idx, 1])
+                rad[idx] = torch.clamp(r, 0, min_loc)
+                # check for overlapping
+                for (cx, cy, cr) in circles[idx+1:]:
+                    dist = torch.sqrt((loca[idx, 0] - cx)**2 + (loca[idx, 1] - cy)**2)
+                    if dist <= r + cr:
+                        rad[idx] = dist - cr
+                        circles[idx, 2] = rad[idx]
+                    if rad[idx] < 0:
+                        rad[idx] = 0
+                        circles[idx, 2] = rad[idx]
+                        break
+            # update the parameters
+            self._radius.data = rad
+        # recompute PSF
+        self._mask_obj.create_mask(self._radius)
+        self._mask_obj.compute_psf()
+        self._psf = self._mask_obj.psf.unsqueeze(0)
+        self._psf = self._psf / self._psf.norm()
+
+        
+                     
+class TrainableHeightVarying(TrainableMask):
+
+    def __init__(
+            self, sensor_name, downsample = None, optimizer="Adam", lr=1e-3, torch_device="cuda", **kwargs
+    ):
+        #1)
+        super().__init__(optimizer, lr, **kwargs)
+
+        #2)
+        assert "distance_sensor" in kwargs, "Distance to sensor must be specified"
+        self._mask_obj = HeightVarying.from_sensor(sensor_name, downsample, is_torch=True, torch_device=torch_device, **kwargs)
+        self._mask = self._mask_obj.mask
+
+        #3)
+        self._height_map = torch.nn.Parameter(self._mask_obj.height_map)
+        initial_param = [self._height_map]
+        
+        #4)
+        self._set_optimizer(initial_param)
+        
+         # 5) compute PSF
+        self._psf = None
+        self.project()
+        
+    def get_psf(self):
+        return self._psf
+
+    def project(self):
+        with torch.no_grad():
+            # clamp back the heights between min_height, and max_height
+            self._height_map.data = torch.clamp(self._height_map.data, self._mask_obj.height_range[0], self._mask_obj.height_range[1])
+        
+        self._mask_obj.create_mask(self._height_map)
+        self._mask_obj.compute_psf()
+        self._psf = self._mask_obj.psf.unsqueeze(0)
+        self._psf = self._psf / self._psf.norm()
+        
+
+            
 
 class TrainablePSF(TrainableMask):
+    # class TrainablePSF(torch.nn.Module, TrainableMask):
     """
     Class for defining an object that directly optimizes the PSF, without any constraints on what can be realized physically.
 
@@ -84,33 +196,45 @@ class TrainablePSF(TrainableMask):
         Otherwise PSF will be returned as RGB. By default False.
     """
 
-    def __init__(self, initial_mask, optimizer="Adam", lr=1e-3, grayscale=False, **kwargs):
-        super().__init__(initial_mask, optimizer, lr, **kwargs)
-        assert (
-            len(initial_mask.shape) == 4
-        ), "Mask must be of shape (depth, height, width, channels)"
+    def __init__(self, initial_psf, optimizer="Adam", lr=1e-3, grayscale=False, **kwargs):
+
+        # BEFORE
+        super().__init__(optimizer, lr, **kwargs)
+        self._psf = torch.nn.Parameter(initial_psf)
+        initial_param = [self._psf]
+        self._set_optimizer(initial_param)
+
+        # # cast as learnable parameters
+        # super().__init__()
+        # self._psf = torch.nn.Parameter(initial_psf)
+        # self._optimizer = getattr(torch.optim, optimizer)([self._psf], lr=lr)
+        # self._counter = 0
+
+        # checks
+        assert len(initial_psf.shape) == 4, "Mask must be of shape (depth, height, width, channels)"
         self.grayscale = grayscale
-        self._is_grayscale = is_grayscale(initial_mask)
+        self._is_grayscale = is_grayscale(initial_psf)
         if grayscale:
-            assert self._is_grayscale, "Mask must be grayscale"
+            assert self._is_grayscale, "PSF must be grayscale"
 
     def get_psf(self):
         if self._is_grayscale:
             if self.grayscale:
                 # simulation in grayscale
-                return self._mask
+                return self._psf
             else:
                 # replicate to 3 channels
-                return self._mask.expand(-1, -1, -1, 3)
+                return self._psf.expand(-1, -1, -1, 3)
         else:
             # assume RGB
-            return self._mask
+            return self._psf
 
     def project(self):
-        self._mask.data = torch.clamp(self._mask, 0, 1)
+        self._psf.data = torch.clamp(self._psf, 0, 1)
 
 
 class AdafruitLCD(TrainableMask):
+    # class AdafruitLCD(torch.nn.Module, TrainableMask):
     def __init__(
         self,
         initial_vals,
@@ -129,7 +253,7 @@ def __init__(
         mask2sensor=None,
         downsample=None,
         min_val=0,
-        **kwargs
+        **kwargs,
     ):
         """
         Parameters
@@ -146,23 +270,31 @@ def __init__(
             Whether to flip the mask vertically, by default False
         """
 
-        super().__init__(initial_vals, **kwargs)
+        super().__init__(optimizer, lr, **kwargs)  # when using TrainableMask init
+        # super().__init__()  # when using torch.nn.Module
+
         self.train_mask_vals = train_mask_vals
+        if train_mask_vals:
+            self._vals = torch.nn.Parameter(initial_vals)
+        else:
+            self._vals = initial_vals
+
         if color_filter is not None:
-            self.color_filter = torch.nn.Parameter(color_filter)
+            self._color_filter = torch.nn.Parameter(color_filter)
             if train_mask_vals:
-                param = [self._mask, self.color_filter]
+                initial_param = [self._vals, self._color_filter]
             else:
-                del self._mask
-                self._mask = initial_vals
-                param = [self.color_filter]
-            self._optimizer = getattr(torch.optim, optimizer)(param, lr=lr)
+                initial_param = [self._color_filter]
         else:
-            self.color_filter = None
             assert (
                 train_mask_vals
             ), "If color filter is not trainable, mask values must be trainable"
 
+        # set optimizer
+        # self._optimizer = getattr(torch.optim, optimizer)(initial_param, lr=lr)
+        # self._counter = 0
+        self._set_optimizer(initial_param)
+
         self.slm_param = slm_dict[slm]
         self.device = slm
         self.sensor = VirtualSensor.from_name(sensor, downsample=downsample)
@@ -185,12 +317,12 @@ def __init__(
     def get_psf(self):
 
         mask = get_programmable_mask(
-            vals=self._mask,
+            vals=self._vals,
             sensor=self.sensor,
             slm_param=self.slm_param,
             rotate=self.rotate,
             flipud=self.flipud,
-            color_filter=self.color_filter,
+            color_filter=self._color_filter,
         )
 
         if self.vertical_shift is not None:
@@ -223,10 +355,197 @@ def get_psf(self):
 
     def project(self):
         if self.train_mask_vals:
-            self._mask.data = torch.clamp(self._mask, self.min_val, 1)
-        if self.color_filter is not None:
-            self.color_filter.data = torch.clamp(self.color_filter, 0, 1)
+            self._vals.data = torch.clamp(self._vals, self.min_val, 1)
+        if self._color_filter is not None:
+            self._color_filter.data = torch.clamp(self._color_filter, 0, 1)
             # normalize each row to 1
-            self.color_filter.data = self.color_filter / self.color_filter.sum(
+            self._color_filter.data = self._color_filter / self._color_filter.sum(
                 dim=[1, 2]
             ).unsqueeze(-1).unsqueeze(-1)
+
+
+class TrainableCodedAperture(TrainableMask):
+    def __init__(
+        self,
+        sensor_name,
+        downsample=None,
+        binary=True,
+        torch_device="cuda",
+        optimizer="Adam",
+        lr=1e-3,
+        **kwargs,
+    ):
+        """
+        TODO: Distinguish between separable and non-separable.
+        """
+
+        # 1) call base constructor so parameters can be set
+        super().__init__(optimizer, lr, **kwargs)
+
+        # 2) initialize mask
+        assert "distance_sensor" in kwargs, "Distance to sensor must be specified"
+        assert "method" in kwargs, "Method must be specified."
+        assert "n_bits" in kwargs, "Number of bits must be specified."
+        # self._mask_obj = CodedAperture.from_sensor(sensor_name, downsample, is_torch=True, **kwargs)
+        self._mask_obj = CodedAperture.from_sensor(
+            sensor_name,
+            downsample,
+            is_torch=True,
+            torch_device=torch_device,
+            **kwargs,
+        )
+        self._mask = self._mask_obj.mask
+
+        # 3) set learnable parameters (should be immediate attributes of the class)
+        self._row = None
+        self._col = None
+        self._vals = None
+        if self._mask_obj.row is not None:
+            # seperable
+            self.separable = True
+            self._row = torch.nn.Parameter(self._mask_obj.row)
+            self._col = torch.nn.Parameter(self._mask_obj.col)
+            initial_param = [self._row, self._col]
+        else:
+            # non-seperable
+            self.separable = False
+            self._vals = torch.nn.Parameter(self._mask_obj.mask)
+            initial_param = [self._vals]
+        self.binary = binary
+
+        # 4) set optimizer
+        self._set_optimizer(initial_param)
+
+        # 5) compute PSF
+        self._psf = None
+        self.project()
+
+    def get_psf(self):
+        # self._mask_obj.create_mask(self._row, self._col)
+        # self._mask_obj.compute_psf()
+        # psf = self._mask_obj.psf.unsqueeze(0)
+
+        # # # need normalize the PSF? would think so but NAN comes up if included
+        # # psf = psf / psf.norm()
+
+        # return psf
+        return self._psf
+
+    def project(self):
+        with torch.no_grad():
+            if self.separable:
+                self._row.data = torch.clamp(self._row, 0, 1)
+                self._col.data = torch.clamp(self._col, 0, 1)
+                if self.binary:
+                    self._row.data = torch.round(self._row)
+                    self._col.data = torch.round(self._col)
+            else:
+                self._vals.data = torch.clamp(self._vals, 0, 1)
+                if self.binary:
+                    self._vals.data = torch.round(self._vals)
+
+        # recompute PSF
+        self._mask_obj.create_mask(self._row, self._col, mask=self._vals)
+        self._mask_obj.compute_psf()
+        self._psf = self._mask_obj.psf.unsqueeze(0)
+        self._psf = self._psf / self._psf.norm()
+
+
+"""
+Utilities to prepare trainable masks.
+"""
+
+trainable_mask_dict = {
+    "AdafruitLCD": AdafruitLCD,
+    "TrainablePSF": TrainablePSF,
+    "TrainableCodedAperture": TrainableCodedAperture,
+    "TrainableHeightVarying": TrainableHeightVarying,
+    "TrainableMultiLensArray": TrainableMultiLensArray,
+}
+
+
+def prep_trainable_mask(config, psf=None, downsample=None):
+    mask = None
+    color_filter = None
+    downsample = config.files.downsample if downsample is None else downsample
+    if config.trainable_mask.mask_type is not None:
+
+        assert config.trainable_mask.mask_type in trainable_mask_dict.keys(), (
+            f"Trainable mask type {config.trainable_mask.mask_type} not supported. "
+            f"Supported types are {trainable_mask_dict.keys()}"
+        )
+        mask_class = trainable_mask_dict[config.trainable_mask.mask_type]
+
+        if isinstance(config.trainable_mask.initial_value, omegaconf.dictconfig.DictConfig):
+
+            # from mask config
+            mask = mask_class(
+                # mask = TrainableCodedAperture(
+                sensor_name=config.simulation.sensor,
+                downsample=downsample,
+                distance_sensor=config.simulation.mask2sensor,
+                optimizer=config.trainable_mask.optimizer,
+                lr=config.trainable_mask.mask_lr,
+                binary=config.trainable_mask.binary,
+                torch_device=config.torch_device,
+                **config.trainable_mask.initial_value,
+            )
+
+        else:
+
+            if config.trainable_mask.initial_value == "random":
+                if psf is not None:
+                    initial_mask = torch.rand_like(psf)
+                else:
+                    sensor = VirtualSensor.from_name(
+                        config.simulation.sensor, downsample=downsample
+                    )
+                    resolution = sensor.resolution
+                    initial_mask = torch.rand((1, *resolution, 3))
+            elif config.trainable_mask.initial_value == "psf":
+                initial_mask = psf.clone()
+            # if file ending with "npy"
+            elif config.trainable_mask.initial_value.endswith("npy"):
+                pattern = np.load(
+                    os.path.join(config.trainable_mask.initial_value) #TODO: get_original_cwd(), 
+                )
+
+                initial_mask = full2subpattern(
+                    pattern=pattern,
+                    shape=config.trainable_mask.ap_shape,
+                    center=config.trainable_mask.ap_center,
+                    slm=config.trainable_mask.slm,
+                )
+                initial_mask = torch.from_numpy(initial_mask.astype(np.float32))
+
+                # prepare color filter if needed
+                from waveprop.devices import slm_dict
+                from waveprop.devices import SLMParam as SLMParam_wp
+
+                slm_param = slm_dict[config.trainable_mask.slm]
+                if (
+                    config.trainable_mask.train_color_filter
+                    and SLMParam_wp.COLOR_FILTER in slm_param.keys()
+                ):
+                    color_filter = slm_param[SLMParam_wp.COLOR_FILTER]
+                    color_filter = torch.from_numpy(color_filter.copy()).to(dtype=torch.float32)
+
+                    # add small random values
+                    color_filter = color_filter + 0.1 * torch.rand_like(color_filter)
+
+            else:
+                raise ValueError(
+                    f"Initial PSF value {config.trainable_mask.initial_value} not supported"
+                )
+
+            if config.trainable_mask.grayscale and not is_grayscale(initial_mask):
+                initial_mask = rgb2gray(initial_mask)
+
+            mask = mask_class(
+                initial_mask,
+                downsample=downsample,
+                color_filter=color_filter,
+                **config.trainable_mask,
+            )
+
+    return mask
diff --git a/lensless/recon/utils.py b/lensless/recon/utils.py
index 53f23a1b..a980a2f4 100644
--- a/lensless/recon/utils.py
+++ b/lensless/recon/utils.py
@@ -415,6 +415,7 @@ def __init__(
                 import lpips
 
                 self.Loss_lpips = lpips.LPIPS(net="vgg").to(self.device)
+                
             except ImportError:
                 return ImportError(
                     "lpips package is need for LPIPS loss. Install using : pip install lpips"
@@ -426,7 +427,7 @@ def __init__(
         self.clip_grad_norm = clip_grad
         self.optimizer_config = optimizer
         self.set_optimizer()
-
+        
         self.metrics = {
             "LOSS": [],  # train loss
             "MSE": [],
@@ -447,10 +448,10 @@ def __init__(
         if metric_for_best_model is not None:
             assert metric_for_best_model in self.metrics.keys()
         self.save_every = save_every
-
+        
         # Backward hook that detect NAN in the gradient and print the layer weights
         if not self.skip_NAN:
-
+        
             def detect_nan(grad):
                 if torch.isnan(grad).any():
                     if self.logger:
@@ -471,11 +472,15 @@ def detect_nan(grad):
 
     def set_optimizer(self, last_epoch=-1):
 
-        if self.optimizer_config.type == "Adam":
-            parameters = [{"params": self.recon.parameters()}]
-            self.optimizer = torch.optim.Adam(parameters, lr=self.optimizer_config.lr)
-        else:
-            raise ValueError(f"Unsupported optimizer : {self.optimizer_config.type}")
+        # if self.optimizer_config.type == "Adam":
+        #     parameters = [{"params": self.recon.parameters()}]
+        #     self.optimizer = torch.optim.Adam(parameters, lr=self.optimizer_config.lr)
+        # else:
+        #     raise ValueError(f"Unsupported optimizer : {self.optimizer_config.type}")
+        parameters = [{"params": self.recon.parameters()}]
+        self.optimizer = getattr(torch.optim, self.optimizer_config.type)(
+            parameters, lr=self.optimizer_config.lr
+        )
 
         # Scheduler
         if self.optimizer_config.slow_start:
@@ -533,9 +538,11 @@ def train_epoch(self, data_loader):
             X = X.to(self.device)
             y = y.to(self.device)
 
-            # update psf according to mask
-            if self.use_mask:
-                self.recon._set_psf(self.mask.get_psf().to(self.device))
+            # BEFORE
+            # # update psf according to mask
+            # if self.use_mask:
+            #     new_psf = self.mask.get_psf().to(self.device)
+            #     self.recon._set_psf(new_psf)
 
             # forward pass
             y_pred = self.recon.batch_call(X.to(self.device))
@@ -548,7 +555,9 @@ def train_epoch(self, data_loader):
             y_max = torch.amax(y, dim=(-1, -2, -3), keepdim=True) + eps
             y = y / y_max
 
-            self.optimizer.zero_grad(set_to_none=True)
+            # BEFORE
+            # self.optimizer.zero_grad(set_to_none=True)
+
             # convert to CHW for loss and remove depth
             y_pred = y_pred.reshape(-1, *y_pred.shape[-3:]).movedim(-1, -3)
             y = y.reshape(-1, *y.shape[-3:]).movedim(-1, -3)
@@ -579,28 +588,53 @@ def train_epoch(self, data_loader):
                     self.Loss_lpips(2 * y_pred - 1, 2 * y - 1)
                 )
             if self.use_mask and self.l1_mask:
-                loss_v = loss_v + self.l1_mask * torch.mean(torch.abs(self.mask._mask))
+                for p in self.mask.parameters():
+                    if p.requires_grad:
+                        loss_v = loss_v + self.l1_mask * torch.mean(torch.abs(p))
             loss_v.backward()
 
+            # check mask parameters are learning
+            if self.use_mask:
+                for p in self.mask.parameters():
+                    assert p.grad is not None
+
             if self.clip_grad_norm is not None:
+                torch.nn.utils.clip_grad_norm_(self.mask.parameters(), self.clip_grad_norm)
                 torch.nn.utils.clip_grad_norm_(self.recon.parameters(), self.clip_grad_norm)
 
             # if any gradient is NaN, skip training step
             if self.skip_NAN:
-                is_NAN = False
+                recon_is_NAN = False
+                mask_is_NAN = False
                 for param in self.recon.parameters():
                     if param.grad is not None and torch.isnan(param.grad).any():
-                        is_NAN = True
+                        recon_is_NAN = True
+                        break
+                for param in self.mask.parameters():
+                    if param.grad is not None and torch.isnan(param.grad).any():
+                        mask_is_NAN = True
                         break
-                if is_NAN:
-                    self.print("NAN detected in gradiant, skipping training step")
+                if recon_is_NAN or mask_is_NAN:
+                    if recon_is_NAN:
+                        self.print(
+                            "NAN detected in reconstruction gradient, skipping training step"
+                        )
+                    if mask_is_NAN:
+                        self.print("NAN detected in mask gradient, skipping training step")
                     i += 1
                     continue
+
             self.optimizer.step()
 
+            # NEW
+            self.optimizer.zero_grad(set_to_none=True)
+
             # update mask
             if self.use_mask:
                 self.mask.update_mask()
+                # NEW
+                self.train_dataloader.dataset.set_psf()
+                self.recon._set_psf(self.mask._psf)
 
             mean_loss += (loss_v.item() - mean_loss) * (1 / i)
             pbar.set_description(f"loss : {mean_loss}")
@@ -626,6 +660,11 @@ def evaluate(self, mean_loss, save_pt, epoch, disp=None):
         if self.test_dataset is None:
             return
 
+        # NEW
+        if self.use_mask:
+            with torch.no_grad():
+                self.test_dataset.set_psf()
+
         output_dir = None
         if disp is not None:
             output_dir = os.path.join("eval_recon")
@@ -642,7 +681,7 @@ def evaluate(self, mean_loss, save_pt, epoch, disp=None):
             output_dir=output_dir,
             crop=self.crop,
         )
-
+        
         # update metrics with current metrics
         self.metrics["LOSS"].append(mean_loss)
         for key in current_metrics:
@@ -659,7 +698,10 @@ def evaluate(self, mean_loss, save_pt, epoch, disp=None):
             if self.lpips is not None:
                 eval_loss += self.lpips * current_metrics["LPIPS_Vgg"]
             if self.use_mask and self.l1_mask:
-                eval_loss += self.l1_mask * np.mean(np.abs(self.mask._mask.cpu().detach().numpy()))
+                for p in self.mask.parameters():
+                    if p.requires_grad:
+                        eval_loss += self.l1_mask * np.mean(np.abs(p.cpu().detach().numpy()))
+                # eval_loss += self.l1_mask * np.mean(np.abs(self.mask._mask.cpu().detach().numpy()))
             return eval_loss
         else:
             return current_metrics[self.metrics["metric_for_best_model"]]
@@ -721,8 +763,8 @@ def train(self, n_epoch=1, save_pt=None, disp=None):
         """
 
         start_time = time.time()
-
-        self.evaluate(-1, save_pt, epoch=0, disp=disp)
+        #self.evaluate(-1, save_pt, epoch=0, disp=disp)
+        
         for epoch in range(n_epoch):
 
             # add extra components (if specified)
@@ -771,23 +813,18 @@ def save(self, epoch, path="recon", include_optimizer=False):
         # create directory if it does not exist
         if not os.path.exists(path):
             os.makedirs(path)
-        # save mask
+
+        # save mask parameters
         if self.use_mask:
-            # torch.save(self.mask._mask, os.path.join(path, f"mask_epoch{epoch}.pt"))
 
-            # save mask as numpy array
-            if self.mask.train_mask_vals:
-                np.save(
-                    os.path.join(path, f"mask_epoch{epoch}.npy"),
-                    self.mask._mask.cpu().detach().numpy(),
-                )
+            for name, param in self.mask.named_parameters():
 
-            if self.mask.color_filter is not None:
-                # save save numpy array
-                np.save(
-                    os.path.join(path, f"mask_color_filter_epoch{epoch}.npy"),
-                    self.mask.color_filter.cpu().detach().numpy(),
-                )
+                # save as numpy array
+                if param.requires_grad:
+                    np.save(
+                        os.path.join(path, f"mask{name}_epoch{epoch}.npy"),
+                        param.cpu().detach().numpy(),
+                    )
 
             torch.save(
                 self.mask._optimizer.state_dict(), os.path.join(path, f"mask_optim_epoch{epoch}.pt")
@@ -802,5 +839,6 @@ def save(self, epoch, path="recon", include_optimizer=False):
         # save optimizer
         if include_optimizer:
             torch.save(self.optimizer.state_dict(), os.path.join(path, f"optim_epoch{epoch}.pt"))
+
         # save recon
         torch.save(self.recon.state_dict(), os.path.join(path, f"recon_epoch{epoch}"))
diff --git a/lensless/utils/dataset.py b/lensless/utils/dataset.py
index 6cd20cd0..3b783866 100644
--- a/lensless/utils/dataset.py
+++ b/lensless/utils/dataset.py
@@ -825,13 +825,26 @@ def __init__(
 
         super(SimulatedDatasetTrainableMask, self).__init__(dataset, simulator, **kwargs)
 
-    def _get_images_pair(self, index):
-        # update psf
-        psf = self._mask.get_psf()
+    def set_psf(self, psf=None):
+        """
+        Set the PSF of the simulator.
+
+        Parameters
+        ----------
+        psf : :py:class:`torch.Tensor`, optional
+            PSF to use for the simulation. If ``None``, the PSF of the mask is used.
+        """
+        if psf is None:
+            psf = self._mask.get_psf()
         self.sim.set_point_spread_function(psf)
 
-        # return simulated images
-        return super()._get_images_pair(index)
+    # def _get_images_pair(self, index):
+    #     # update psf
+    #     psf = self._mask.get_psf()
+    #     self.sim.set_point_spread_function(psf)
+
+    #     # return simulated images
+    #     return super()._get_images_pair(index)
 
 
 class HITLDatasetTrainableMask(SimulatedDatasetTrainableMask):
diff --git a/lensless/utils/io.py b/lensless/utils/io.py
index 750b0e0e..cd728ae5 100644
--- a/lensless/utils/io.py
+++ b/lensless/utils/io.py
@@ -540,7 +540,11 @@ def save_image(img, fp, max_val=255):
         img_tmp *= max_val
         img_tmp = img_tmp.astype(np.uint8)
 
-    img_tmp = Image.fromarray(img_tmp)
+    # RGB
+    if len(img_tmp.shape) == 3 and img_tmp.shape[2] == 3:
+        img_tmp = Image.fromarray(img_tmp)
+    else:
+        img_tmp = Image.fromarray(img_tmp.squeeze())
     img_tmp.save(fp)
 
 
diff --git a/scripts/recon/train_unrolled.py b/scripts/recon/train_unrolled.py
index eaace9a8..2f11df99 100644
--- a/scripts/recon/train_unrolled.py
+++ b/scripts/recon/train_unrolled.py
@@ -39,6 +39,7 @@
 import numpy as np
 import time
 from lensless import UnrolledFISTA, UnrolledADMM
+from lensless.hardware.trainable_mask import prep_trainable_mask
 from lensless.utils.dataset import (
     DiffuserCamMirflickr,
     SimulatedFarFieldDataset,
@@ -47,9 +48,6 @@
     HITLDatasetTrainableMask,
 )
 from torch.utils.data import Subset
-import lensless.hardware.trainable_mask
-from lensless.hardware.slm import full2subpattern
-from lensless.hardware.sensor import VirtualSensor
 from lensless.recon.utils import create_process_network
 from lensless.utils.image import rgb2gray, is_grayscale
 from lensless.utils.simulation import FarFieldSimulator
@@ -66,11 +64,15 @@
 log = logging.getLogger(__name__)
 
 
+
 def simulate_dataset(config, generator=None):
 
-    if config.torch_device == "cuda" and torch.cuda.is_available():
-        device = "cuda"
+    if "cuda" in config.torch_device and torch.cuda.is_available():
+        # if config.torch_device == "cuda" and torch.cuda.is_available():
+        log.info("Using GPU for training.")
+        device = config.torch_device
     else:
+        log.info("Using CPU for training.")
         device = "cpu"
 
     # -- prepare PSF
@@ -107,6 +109,7 @@ def simulate_dataset(config, generator=None):
         transform = transforms.Compose(transforms_list)
         train_ds = datasets.MNIST(root=data_path, train=True, download=True, transform=transform)
         test_ds = datasets.MNIST(root=data_path, train=False, download=True, transform=transform)
+
     elif config.files.dataset == "fashion_mnist":
         transform = transforms.Compose(transforms_list)
         train_ds = datasets.FashionMNIST(
@@ -119,6 +122,7 @@ def simulate_dataset(config, generator=None):
         transform = transforms.Compose(transforms_list)
         train_ds = datasets.CIFAR10(root=data_path, train=True, download=True, transform=transform)
         test_ds = datasets.CIFAR10(root=data_path, train=False, download=True, transform=transform)
+
     elif config.files.dataset == "CelebA":
         root = config.files.celeba_root
         data_path = os.path.join(root, "celeba")
@@ -144,6 +148,13 @@ def simulate_dataset(config, generator=None):
     else:
         raise NotImplementedError(f"Dataset {config.files.dataset} not implemented.")
 
+    if config.files.dataset != "CelebA":
+        if config.files.n_files is not None:
+            train_size = int((1 - config.files.test_size) * config.files.n_files)
+            test_size = config.files.n_files - train_size
+            train_ds = Subset(train_ds, np.arange(train_size))
+            test_ds = Subset(test_ds, np.arange(test_size))
+
     # convert PSF
     if config.simulation.grayscale and not is_grayscale(psf):
         psf = rgb2gray(psf)
@@ -264,70 +275,9 @@ def simulate_dataset(config, generator=None):
     return train_ds_prop, test_ds_prop, mask
 
 
-def prep_trainable_mask(config, psf=None, downsample=None):
-    mask = None
-    color_filter = None
-    downsample = config.files.downsample if downsample is None else downsample
-    if config.trainable_mask.mask_type is not None:
-        mask_class = getattr(lensless.hardware.trainable_mask, config.trainable_mask.mask_type)
-
-        if config.trainable_mask.initial_value == "random":
-            if psf is not None:
-                initial_mask = torch.rand_like(psf)
-            else:
-                sensor = VirtualSensor.from_name(config.simulation.sensor, downsample=downsample)
-                resolution = sensor.resolution
-                initial_mask = torch.rand((1, *resolution, 3))
-        elif config.trainable_mask.initial_value == "psf":
-            initial_mask = psf.clone()
-        # if file ending with "npy"
-        elif config.trainable_mask.initial_value.endswith("npy"):
-            pattern = np.load(os.path.join(get_original_cwd(), config.trainable_mask.initial_value))
-
-            initial_mask = full2subpattern(
-                pattern=pattern,
-                shape=config.trainable_mask.ap_shape,
-                center=config.trainable_mask.ap_center,
-                slm=config.trainable_mask.slm,
-            )
-            initial_mask = torch.from_numpy(initial_mask.astype(np.float32))
-
-            # prepare color filter if needed
-            from waveprop.devices import slm_dict
-            from waveprop.devices import SLMParam as SLMParam_wp
-
-            slm_param = slm_dict[config.trainable_mask.slm]
-            if (
-                config.trainable_mask.train_color_filter
-                and SLMParam_wp.COLOR_FILTER in slm_param.keys()
-            ):
-                color_filter = slm_param[SLMParam_wp.COLOR_FILTER]
-                color_filter = torch.from_numpy(color_filter.copy()).to(dtype=torch.float32)
-
-                # add small random values
-                color_filter = color_filter + 0.1 * torch.rand_like(color_filter)
-        else:
-            raise ValueError(
-                f"Initial PSF value {config.trainable_mask.initial_value} not supported"
-            )
-
-        if config.trainable_mask.grayscale and not is_grayscale(initial_mask):
-            initial_mask = rgb2gray(initial_mask)
 
-        mask = mask_class(
-            initial_mask,
-            optimizer="Adam",
-            downsample=downsample,
-            color_filter=color_filter,
-            **config.trainable_mask,
-        )
-
-    return mask
-
-
-@hydra.main(version_base=None, config_path="../../configs", config_name="train_unrolledADMM")
+@hydra.main(version_base=None, config_path="../../configs", config_name="train_multilens_array")
 def train_unrolled(config):
-
     # set seed
     seed = config.seed
     torch.manual_seed(seed)
@@ -346,9 +296,10 @@ def train_unrolled(config):
     if save:
         save = os.getcwd()
 
-    if config.torch_device == "cuda" and torch.cuda.is_available():
+    if "cuda" in config.torch_device and torch.cuda.is_available():
+        # if config.torch_device == "cuda" and torch.cuda.is_available():
         log.info("Using GPU for training.")
-        device = "cuda"
+        device = config.torch_device
     else:
         log.info("Using CPU for training.")
         device = "cpu"
@@ -615,3 +566,4 @@ def train_unrolled(config):
 
 if __name__ == "__main__":
     train_unrolled()
+    
\ No newline at end of file
diff --git a/test/test_masks.py b/test/test_masks.py
index a16659d6..c64b45a6 100644
--- a/test/test_masks.py
+++ b/test/test_masks.py
@@ -1,8 +1,10 @@
 import numpy as np
-from lensless.hardware.mask import CodedAperture, PhaseContour, FresnelZoneAperture
+from lensless.hardware.mask import CodedAperture, PhaseContour, FresnelZoneAperture, HeightVarying, MultiLensArray
 from lensless.eval.metric import mse, psnr, ssim
 from waveprop.fresnel import fresnel_conv
-
+from matplotlib import pyplot as plt
+from lensless.hardware.trainable_mask import TrainableMask
+import torch 
 
 resolution = np.array([380, 507])
 d1 = 3e-6
@@ -34,7 +36,7 @@ def test_flatcam():
     desired_psf_shape = np.array(tuple(resolution) + (len(mask2.psf_wavelength),))
     assert np.all(mask2.psf.shape == desired_psf_shape)
 
-
+"""
 def test_phlatcam():
 
     mask = PhaseContour(
@@ -53,7 +55,7 @@ def test_phlatcam():
     assert mse(abs(Mp), np.sqrt(mask.target_psf)) < 0.1
     assert psnr(abs(Mp), np.sqrt(mask.target_psf)) > 30
     assert abs(1 - ssim(abs(Mp), np.sqrt(mask.target_psf), channel_axis=None)) < 0.1
-
+"""
 
 def test_fza():
 
@@ -69,14 +71,13 @@ def test_classmethod():
 
     downsample = 8
 
-    mask1 = CodedAperture.from_sensor(
+    """mask1 = CodedAperture.from_sensor(
         sensor_name="rpi_hq", downsample=downsample, distance_sensor=dz
     )
     assert np.all(mask1.mask.shape == resolution)
     desired_psf_shape = np.array(tuple(resolution) + (len(mask1.psf_wavelength),))
-    assert np.all(mask1.psf.shape == desired_psf_shape)
-
-    mask2 = PhaseContour.from_sensor(
+    assert np.all(mask1.psf.shape == desired_psf_shape)"""
+    """mask2 = PhaseContour.from_sensor(
         sensor_name="rpi_hq", downsample=downsample, distance_sensor=dz
     )
     assert np.all(mask2.mask.shape == resolution)
@@ -89,10 +90,56 @@ def test_classmethod():
     assert np.all(mask3.mask.shape == resolution)
     desired_psf_shape = np.array(tuple(resolution) + (len(mask3.psf_wavelength),))
     assert np.all(mask3.psf.shape == desired_psf_shape)
+    """
+    mask4 = MultiLensArray.from_sensor(
+        sensor_name="rpi_hq", downsample=downsample, distance_sensor=dz, N=10, is_Torch=True#radius=np.array([10, 25]), loc=np.array([[10.1, 11.3], [56.5, 89.2]])
+    )
+
+    phase = None
+    if not mask4.is_torch:
+        assert np.all(mask4.mask.shape == resolution)
+        desired_psf_shape = np.array(tuple(resolution) + (len(mask4.psf_wavelength),))
+        assert np.all(mask4.psf.shape == desired_psf_shape)
+        phase = mask4.phi
+    else:
+        # PyTorch operations
+        assert torch.equal(torch.tensor(mask4.mask.shape), torch.tensor(resolution))
+        desired_psf_shape = torch.tensor(tuple(resolution) + (len(mask4.psf_wavelength),))
+        assert torch.equal(torch.tensor(mask4.psf.shape), desired_psf_shape)
+        angle=torch.angle(mask4.mask).cpu().detach().numpy()
+    fig, ax = plt.subplots()
+    im = ax.imshow(mask4.phi, cmap="gray")
+    fig.colorbar(im, ax=ax, shrink=0.5, aspect=5)
+    plt.show()
+    """
+    
+    mask5 = HeightVarying.from_sensor(
+        sensor_name="rpi_hq", downsample=downsample, distance_sensor=dz, is_Torch=False
+    )
+    #assert mask5.is_Torch
+    if not mask5.is_torch:
+        # NumPy operations
+        assert np.all(mask5.mask.shape == resolution)
+        desired_psf_shape = np.array(tuple(resolution) + (len(mask5.psf_wavelength),))
+        assert np.all(mask5.psf.shape == desired_psf_shape)
+        fig, ax = plt.subplots()
+        im = ax.imshow(np.angle(mask5.mask), cmap="gray")
+        fig.colorbar(im, ax=ax, shrink=0.5, aspect=5)
+        plt.show()
+    else:
+        # PyTorch operations
+        assert torch.equal(torch.tensor(mask5.mask.shape), torch.tensor(resolution))
+        desired_psf_shape = torch.tensor(tuple(resolution) + (len(mask5.psf_wavelength),))
+        assert torch.equal(torch.tensor(mask5.psf.shape), desired_psf_shape)
+        fig, ax = plt.subplots()
+        im = ax.imshow(torch.angle(mask5.mask), cmap="gray")
+        fig.colorbar(im, ax=ax, shrink=0.5, aspect=5)
+        plt.show()"""
+    
 
 
 if __name__ == "__main__":
-    test_flatcam()
-    test_phlatcam()
-    test_fza()
+##    test_flatcam()
+##    test_phlatcam()
+##    test_fza()
     test_classmethod()