Symmetria

A Synthetic Dataset for Learning in Point Clouds

Project page of the Symmetria dataset.

Abstract

Unlike in image or text domains that benefit from an abundance of large-scale datasets, point cloud learning techniques frequently encounter limitations due to the scarcity of extensive datasets. To overcome this limitation, we present Symmetria, a formula-driven dataset that can be generated to any arbitrary size. By utilizing the concept of symmetry, we create shapes with known structure and high variability, enabling neural networks to effectively learn point cloud features. Our results demonstrate that this dataset is highly effective for point cloud self-supervised pre-training, yielding models with strong performance in downstream tasks such as classification and segmentation, which also show good few-shot learning capabilities. Additionally, our dataset can be effectively used to fine-tune models to classify real-world objects, highlighting the practical utility and application of our approach. We also introduce a challenging task for symmetry detection and provide a benchmark for baseline comparisons. A significant advantage of our approach is the public availability of the dataset, the accompanying code, and the ability to generate very large collections, promoting further research and innovation in point cloud learning.

Download

Here you can download the Symmetria dataset. The dataset is composed of four sub-datasets: easy, intermediate-1, intermediate-2, and hard. We believe these datasets will remain challenging for several years to come, so feel free to test your methods against them.

Symmetria-Easy [10k-100k]

7 classes: Astroid, Citrus, Egg of Keplero, Geometric Petal, Lemniscate of Bernoulli, m-Convexities and Mouth curve. Random rotations are applied with probability 0.5 and exclusively around the x-axis.
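To give a feel for what "formula-driven" means for one of these classes, the sketch below samples points on a lemniscate of Bernoulli from its standard parametric form, which satisfies (x^2 + y^2)^2 = a^2 (x^2 - y^2). This is only an illustration: the function name, the scale parameter `a`, and the choice to embed the curve in the z = 0 plane are assumptions, and the actual Symmetria generator may build its shapes differently.

```python
import numpy as np


def sample_lemniscate(n_points, a=1.0, rng=None):
    """Sample points on a lemniscate of Bernoulli lying in the z = 0 plane.

    Hypothetical helper for illustration; not taken from the Symmetria code.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = rng.uniform(0.0, 2.0 * np.pi, size=n_points)
    denom = 1.0 + np.sin(t) ** 2
    x = a * np.cos(t) / denom
    y = a * np.sin(t) * np.cos(t) / denom
    z = np.zeros_like(x)  # assumption: planar curve embedded in 3D
    return np.stack([x, y, z], axis=1)


points = sample_lemniscate(1024, a=1.0, rng=np.random.default_rng(0))
```

The curve is mirror-symmetric about both the x and y axes, which is the kind of known structure that can be used as a label for symmetry detection.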

Symmetria-Intermediate-1 [10k-100k]

Two more classes than Symmetria-Easy: Square and Cylinder (with the related conics).

Symmetria-Intermediate-2 [10k-100k]

Rotations now have a probability of 0.75 (instead of 0.5) and are applied around two axes (both x and y).

Symmetria-Hard [10k-100k]

One more class with just one axial symmetry, Revolution, for a total of 10 classes. Rotations are now applied around all three axes with probability 1.0.
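The rotation schedule across the four variants can be sketched as follows. This is a minimal illustration under stated assumptions, not the dataset's generation code: the `ROTATION_SETTINGS` table and both function names are hypothetical, angles are assumed to be drawn uniformly from [0, 2π), and Intermediate-1 is assumed to keep the Easy rotation settings because only its classes are described as changing.

```python
import numpy as np

# Hypothetical settings read off the variant descriptions above:
# (axes to rotate around, probability of applying the rotation).
ROTATION_SETTINGS = {
    "easy": ("x", 0.5),
    "intermediate-1": ("x", 0.5),  # assumption: same as easy
    "intermediate-2": ("xy", 0.75),
    "hard": ("xyz", 1.0),
}


def axis_rotation(axis, angle):
    """Return the 3x3 rotation matrix for a rotation about one coordinate axis."""
    c, s = np.cos(angle), np.sin(angle)
    if axis == "x":
        return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    if axis == "y":
        return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    if axis == "z":
        return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    raise ValueError(f"unknown axis: {axis!r}")


def random_rotate(points, axes, prob, rng):
    """With probability `prob`, rotate an (N, 3) cloud by a random angle per axis."""
    if rng.random() >= prob:
        return points
    rotation = np.eye(3)
    for axis in axes:
        rotation = axis_rotation(axis, rng.uniform(0.0, 2.0 * np.pi)) @ rotation
    return points @ rotation.T


rng = np.random.default_rng(0)
cloud = rng.normal(size=(1024, 3))
axes, prob = ROTATION_SETTINGS["hard"]
rotated = random_rotate(cloud, axes, prob, rng)
```

For symmetry detection, the same rotation matrix would also have to be applied to the ground-truth symmetry planes or axes so that the labels stay consistent with the rotated shape.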

Symmetria-SSL [10k-50k]

The experimental and actual self-supervised pre-training versions of Symmetria. The size of Sym-SSL-50k was chosen to be as comparable as possible to ShapeNetCore, the basic version of ShapeNet containing symmetry-level annotations.

Ablation Study

This section collects the ablation study graphs that were left out of the main manuscript for brevity. It is intended for readers who want a closer look at the ablation results, in particular how performance differs between the Symmetria Easy and Symmetria Hard datasets. All the data generated during these runs and the scripts used to create these graphs are available here.

Undersampling and noise

Single-class datasets with 2,000 samples, Gaussian/uniform noise and undersampling transforms applied with varying probabilities, training on a standard PointNet encoder (24 M parameters)
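The perturbations named above could look roughly like the sketch below. All function names and parameters (`sigma`, `half_width`, `keep_ratio`) are hypothetical, and the noise magnitudes and sampling ratios used in the actual ablation runs are not stated here, so the values shown are placeholders.

```python
import numpy as np


def add_gaussian_noise(points, sigma, rng):
    """Jitter every coordinate with zero-mean Gaussian noise of std `sigma`."""
    return points + rng.normal(0.0, sigma, size=points.shape)


def add_uniform_noise(points, half_width, rng):
    """Jitter every coordinate with noise drawn uniformly from [-half_width, half_width)."""
    return points + rng.uniform(-half_width, half_width, size=points.shape)


def undersample(points, keep_ratio, rng):
    """Keep a random subset of the points, without replacement."""
    n_keep = max(1, int(round(keep_ratio * len(points))))
    idx = rng.choice(len(points), size=n_keep, replace=False)
    return points[idx]


def apply_with_probability(transform, prob, points, rng):
    """Apply `transform` to the cloud with probability `prob`, else return it unchanged."""
    return transform(points) if rng.random() < prob else points


rng = np.random.default_rng(0)
cloud = rng.normal(size=(2048, 3))
# Placeholder settings: Gaussian jitter applied to half of the samples on average.
noisy = apply_with_probability(
    lambda p: add_gaussian_noise(p, sigma=0.01, rng=rng), 0.5, cloud, rng
)
sparse = undersample(cloud, keep_ratio=0.25, rng=rng)
```

Note that undersampling changes the number of points per cloud, so a batching step that expects a fixed size would need padding or resampling afterwards.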

Rotations - PointNet

Single-class datasets with 2,000 samples, rotations around x, x/y and x/y/z axes with varying probabilities, training on a standard PointNet encoder (24 M parameters)

Rotations - PointNeXt XXL

Single-class datasets with 2,000 samples, rotations around x, x/y and x/y/z axes with varying probabilities, training on a scaled-up PointNeXt encoder (73 M parameters)