---
language: en
license: mit
tags:
- image-classification
- imagenet
- geometric-basin
- cantor-coherence
- multi-scale
- geofractaldavid
datasets:
- imagenet-1k
metrics:
- accuracy
library_name: pytorch
model-index:
- name: GeoFractalDavid-Basin-k12
results:
- task:
type: image-classification
dataset:
name: ImageNet-1K
type: imagenet-1k
metrics:
- type: accuracy
value: 71.40
name: Validation Accuracy
---
# GeoFractalDavid-Basin-k12: Geometric Basin Classification
**GeoFractalDavid** classifies through geometric compatibility rather than a conventional linear cross-entropy head.
Features must "fit" learned geometric signatures: k-simplex shapes, Cantor positions, and hierarchical structure.
## Performance
- **Best Validation Accuracy**: 71.40%
- **Epoch**: 10/10
- **Training Time**: 18m 45s
### Per-Scale Performance
- **Scale 384D**: 61.25%
- **Scale 512D**: 60.67%
- **Scale 768D**: 70.50%
- **Scale 1024D**: 51.69%
- **Scale 1280D**: 44.72%
## Architecture
**Model Type**: Multi-scale geometric basin classifier
**Core Components**:
- **Feature Dimension**: 512
- **Number of Classes**: 1000
- **k-Simplex Structure**: k=12 (13 vertices per class)
- **Scales**: [384, 512, 768, 1024, 1280]
- **Total Simplex Vertices**: 13,000
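The vertex counts above can be sketched as tensor shapes. This is an illustrative layout only (the attribute names and initialization are assumptions, not the actual GeoFractalDavid internals): with k=12, each of the 1000 classes owns k+1 = 13 vertices, giving 13,000 vertices at a given scale.

```python
import torch
import torch.nn as nn

# Illustrative layout only -- names and initialization are assumptions,
# not the actual GeoFractalDavid parameter layout.
num_classes, k, feature_dim = 1000, 12, 512

# One k-simplex per class: k + 1 = 13 vertices each, 13,000 vertices total
vertices = nn.Parameter(torch.randn(num_classes, k + 1, feature_dim) * 0.02)
centroids = vertices.mean(dim=1)  # one centroid per class: [1000, 512]

print(vertices.shape)   # torch.Size([1000, 13, 512])
print(centroids.shape)  # torch.Size([1000, 512])
```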
**Geometric Components**:
1. **Feature Similarity**: Cosine similarity to k-simplex centroids
2. **Cantor Coherence**: Distance to learned Cantor prototypes (alpha-normalized)
3. **Crystal Geometry**: Distance to nearest simplex vertex
Each scale learns to weight these components differently.
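The three components above combine into a single per-class compatibility score. The sketch below is a minimal interpretation of that description with hypothetical tensor layouts; the real implementation may differ.

```python
import torch
import torch.nn.functional as F

def compatibility(features, centroids, vertices, cantor_protos, cantor_pos, w):
    """Illustrative combination of the three geometric components
    (a sketch, not the actual GeoFractalDavid code).

    features:      [B, D]    input features
    centroids:     [C, D]    k-simplex centroid per class
    vertices:      [C, K, D] simplex vertices per class (K = k + 1)
    cantor_protos: [C]       scalar Cantor prototype per class
    cantor_pos:    [B]       Cantor position of each feature
    w:             [3]       learned weights (feature, cantor, crystal)
    """
    # 1. Feature similarity: cosine similarity to class centroids -> [B, C]
    feat_sim = F.normalize(features, dim=-1) @ F.normalize(centroids, dim=-1).T
    # 2. Cantor coherence: negative distance to class Cantor prototypes -> [B, C]
    cantor_sim = -(cantor_pos[:, None] - cantor_protos[None, :]).abs()
    # 3. Crystal geometry: negative distance to nearest simplex vertex -> [B, C]
    d = torch.cdist(features, vertices.flatten(0, 1))
    crystal_sim = -d.view(features.shape[0], -1, vertices.shape[1]).min(dim=-1).values
    return w[0] * feat_sim + w[1] * cantor_sim + w[2] * crystal_sim
```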
## Learned Structure
### Alpha Convergence (Global Cantor Stairs)
The alpha parameter controls middle-interval weighting in the Cantor staircase.
- **Initial**: 0.3290
- **Final**: -0.0764
- **Change**: -0.4055
- **Converged to 0.5**: False
The Cantor staircase uses soft triadic decomposition with learnable alpha to map
features into [0,1] space with fractal structure.
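A minimal sketch of such a staircase, assuming a softmax-based soft assignment over the three triadic intervals and treating alpha as the value emitted for the middle interval; the model's exact decomposition may differ.

```python
import torch

def soft_cantor_stairs(x, alpha=0.5, depth=6, tau=0.25):
    """Illustrative soft Cantor staircase (a sketch, not the model's code).
    Maps x in [0, 1] onto a devil's-staircase-like curve; the middle triadic
    interval contributes `alpha` instead of being flat. A learnable alpha
    would be kept as a tensor for differentiability.
    """
    x = x.clamp(0.0, 1.0 - 1e-6)
    y = torch.zeros_like(x)
    centers = torch.tensor([1.0 / 6.0, 0.5, 5.0 / 6.0])  # triadic interval centers
    values = torch.tensor([0.0, float(alpha), 1.0])      # value emitted per interval
    for level in range(depth):
        frac = x - x.floor()
        # soft membership of frac in the three triadic intervals
        w = torch.softmax(-((frac.unsqueeze(-1) - centers) ** 2) / tau, dim=-1)
        y = y + (w * values).sum(-1) * 0.5 ** (level + 1)
        x = 3.0 * frac  # zoom into the current interval for the next level
    return y
```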
### Cantor Prototype Distribution
Each class has a learned scalar Cantor prototype. The model pulls features toward
their class's Cantor position.
**Scale 384D**:
- Mean: 0.0226
- Std: 0.0784
- Range: [-0.1377, 0.1894]
**Scale 512D**:
- Mean: 0.0226
- Std: 0.0784
- Range: [-0.1377, 0.1895]
**Scale 768D**:
- Mean: 0.0227
- Std: 0.0784
- Range: [-0.1373, 0.1897]
**Scale 1024D**:
- Mean: 0.0226
- Std: 0.0784
- Range: [-0.1375, 0.1896]
**Scale 1280D**:
- Mean: 0.0227
- Std: 0.0784
- Range: [-0.1375, 0.1898]
Prototypes cluster tightly around 0 with a smooth spread over roughly [-0.14, 0.19], nearly identical across scales.
This creates a continuous manifold rather than discrete bins.
### Geometric Weight Evolution
Each scale learns optimal weights for combining geometric components:
**Scale 384D**: Feature=0.929, Cantor=0.020, Crystal=0.051
**Scale 512D**: Feature=0.885, Cantor=0.023, Crystal=0.092
**Scale 768D**: Feature=0.996, Cantor=0.001, Crystal=0.003
**Scale 1024D**: Feature=0.952, Cantor=0.005, Crystal=0.043
**Scale 1280D**: Feature=0.411, Cantor=0.003, Crystal=0.587
**Pattern**: Most scales rely overwhelmingly on feature similarity; only the largest scale (1280D) shifts substantial weight to crystal geometry.
This hierarchical division of labor emerges from training rather than being imposed.
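The weights in each row above sum to roughly 1, which is consistent with (though not proof of) a softmax parameterization per scale. A hypothetical sketch:

```python
import torch
import torch.nn as nn

class ScaleWeights(nn.Module):
    """Illustrative per-scale weighting of the three geometric components
    (an assumption about the parameterization, not the actual code).
    A softmax over three logits keeps the weights positive and summing
    to 1, matching the tables above."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(3))  # feature, cantor, crystal

    def forward(self, feat_sim, cantor_sim, crystal_sim):
        w = torch.softmax(self.logits, dim=0)
        return w[0] * feat_sim + w[1] * cantor_sim + w[2] * crystal_sim
```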
## Usage
```python
import torch
from safetensors.torch import load_file

from geovocab2.train.model.core.geo_fractal_david import GeoFractalDavid

# Load model with the configuration described above (k=12, five scales)
model = GeoFractalDavid(
    feature_dim=512,
    num_classes=1000,
    k=12,
    scales=[384, 512, 768, 1024, 1280],
    alpha_init=0.5,
    tau=0.25,
)
state_dict = load_file("weights/.../best_model_acc{best_acc:.2f}.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference (`features` are precomputed CLIP image features, [batch_size, 512])
with torch.no_grad():
    logits = model(features)  # [batch_size, 1000]
    predictions = logits.argmax(dim=-1)

# Inspect learned structure
print(f"Global Alpha: {model.cantor_stairs.alpha.item():.4f}")
geo_weights = model.get_geometric_weights()
cantor_dist = model.get_cantor_interval_distribution(sample_features)
```
## Training Details
**Loss Function**: Contrastive Geometric Basin
- Primary: Maximize correct class compatibility, minimize incorrect
- Regularization: Cantor coherence, separation, discretization
**Optimization**:
- Optimizer: AdamW with separate learning rates
- Scales: {config.learning_rate}
- Fusion weights: {config.learning_rate * 0.5}
- Cantor stairs: {config.learning_rate * 0.1}
- Weight decay: {config.weight_decay}
- Gradient clipping: {config.gradient_clip}
- Scheduler: {config.scheduler_type}
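The learning-rate ratios above (fusion weights at 0.5x, Cantor stairs at 0.1x the base rate) map naturally onto AdamW parameter groups. A sketch with placeholder values; the real base rate, weight decay, and parameter lists live in `train_config.json` and the model definition:

```python
import torch

# Placeholder values and stand-in parameters -- illustrative only.
base_lr = 1e-3
scale_params = [torch.nn.Parameter(torch.randn(4))]   # per-scale parameters
fusion_params = [torch.nn.Parameter(torch.randn(3))]  # fusion weights
cantor_params = [torch.nn.Parameter(torch.randn(1))]  # Cantor stairs (alpha)

optimizer = torch.optim.AdamW(
    [
        {"params": scale_params, "lr": base_lr},
        {"params": fusion_params, "lr": base_lr * 0.5},
        {"params": cantor_params, "lr": base_lr * 0.1},
    ],
    weight_decay=1e-2,
)
```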
**Data**:
- Dataset: ImageNet-1K CLIP features ({config.model_variant})
- Batch size: {config.batch_size}
- Training samples: 1,281,167
- Validation samples: 50,000
**Hub Upload**: {"Periodic (every " + str(config.hub_upload_interval) + " epochs)" if config.hub_upload_interval > 0 else "End of training only"}
## Key Innovation
**No Cross-Entropy on Arbitrary Weights**
Traditional: `cross_entropy(W @ features + b, labels)`
- W and b are arbitrary learned parameters
**Geometric Basin**: `contrastive_loss(compatibility_scores, labels)`
- Compatibility from geometric structure:
- Feature β Simplex centroid similarity
- Feature β Cantor prototype coherence
- Feature β Simplex vertex distance
- Cross-entropy applied to geometrically meaningful scores
- Structure enforced through geometric regularization
Result: Classification emerges from geometric organization, not arbitrary mappings.
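The contrast with a plain linear head can be made concrete: the loss is still a cross-entropy, but over geometric compatibility scores instead of `W @ features + b`. A minimal sketch; the actual training loss also adds the regularizers listed under Training Details.

```python
import torch
import torch.nn.functional as F

def contrastive_basin_loss(scores, labels, temperature=0.1):
    """Illustrative contrastive objective over compatibility scores
    (a sketch of the idea, not the exact training loss). Cross-entropy
    over geometric compatibilities pushes each feature toward its own
    class basin and away from all others."""
    return F.cross_entropy(scores / temperature, labels)

# Usage with random stand-in compatibility scores
scores = torch.randn(8, 1000)          # [batch, classes]
labels = torch.randint(0, 1000, (8,))
loss = contrastive_basin_loss(scores, labels)
```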
## Visualizations
The repository includes visualizations of learned structure:
- Cantor prototype distributions (histograms per scale)
- Sorted prototype curves (showing smooth manifold)
- Cross-scale analysis (mean, variance, geometric weights)
See `weights/{model_name}/{config.run_id}/` for generated plots.
## Repository Structure
```
weights/{model_name}/{config.run_id}/
├── best_model_acc{best_acc:.2f}.safetensors    # Model weights
├── best_model_acc{best_acc:.2f}_metadata.json  # Training metadata
├── train_config.json                           # Training configuration
├── training_history.json                       # Epoch-by-epoch history
├── cantor_prototypes_distribution.png          # Histogram analysis
├── cantor_prototypes_sorted.png                # Sorted manifold view
└── cantor_prototypes_cross_scale.png           # Cross-scale comparison

runs/{model_name}/{config.run_id}/
└── events.out.tfevents.*                       # TensorBoard logs
```
**Note**: Visualizations (*.png) are generated by running the probe script and should be
copied to the weights directory before uploading to Hub.
## Research
This architecture demonstrates:
1. **Rapid learning** (70%+ after 1 epoch, comparable to FractalDavid)
2. **Geometric organization** (classes spread smoothly in Cantor space)
3. **Hierarchical strategy** (scales learn different geometric weightings)
4. **Emergent structure** (alpha settles to a stable value, prototypes cluster naturally)
The geometric constraints guide learning toward structured representations
without explicit supervision of the geometric components.
## Citation
```bibtex
@software{geofractaldavid2025,
  title = {GeoFractalDavid: Geometric Basin Classification},
  author = {AbstractPhil},
  year = {2025},
  url = {https://huggingface.co/{config.hf_repo if config.hf_repo else 'MODEL_REPO'}},
  note = {Multi-scale geometric basin classifier with k-simplex structure}
}
```
## License
MIT License - See LICENSE file for details.
---
*Model trained on {datetime.now().strftime('%Y-%m-%d')}*
*Run ID: {config.run_id}*