Dispatch Levels

Level 1 — OaModule + Autograd

Subclass OaModule, register its layers, and override Forward. OaFnGrad::Backward propagates gradients from the loss, and OaAdamW updates the weights. The same pattern scales from a two-layer MLP to a full transformer.
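To make "Backward propagates gradients" concrete, here is a minimal sketch of reverse-mode autodiff on scalars. It is a standalone illustration, not how OaFnGrad is implemented. TapeSketch and its members are hypothetical names invented for this example. Because the tape is rebuilt on every forward pass, the recorded graph is free to change from step to step.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical scalar tape for illustration only; these are not Oa types.
struct TapeSketch {
    std::vector<double> Value;  // forward value of each node
    std::vector<double> Grad;   // accumulated d(loss)/d(node)
    std::vector<std::function<void(TapeSketch&)>> BackwardFns;

    // Input or parameter node: nothing to propagate further.
    std::size_t Leaf(double v) {
        Value.push_back(v);
        Grad.push_back(0.0);
        BackwardFns.push_back([](TapeSketch&) {});
        return Value.size() - 1;
    }
    std::size_t Add(std::size_t a, std::size_t b) {
        const std::size_t out = Leaf(Value[a] + Value[b]);
        BackwardFns[out] = [a, b, out](TapeSketch& t) {
            t.Grad[a] += t.Grad[out];
            t.Grad[b] += t.Grad[out];
        };
        return out;
    }
    std::size_t Mul(std::size_t a, std::size_t b) {
        const std::size_t out = Leaf(Value[a] * Value[b]);
        BackwardFns[out] = [a, b, out](TapeSketch& t) {
            t.Grad[a] += t.Value[b] * t.Grad[out];
            t.Grad[b] += t.Value[a] * t.Grad[out];
        };
        return out;
    }
    std::size_t Relu(std::size_t a) {
        const std::size_t out = Leaf(Value[a] > 0.0 ? Value[a] : 0.0);
        BackwardFns[out] = [a, out](TapeSketch& t) {
            t.Grad[a] += (t.Value[a] > 0.0 ? 1.0 : 0.0) * t.Grad[out];
        };
        return out;
    }
    // Seed the scalar loss with gradient 1, then visit nodes in reverse
    // creation order so each node's gradient is complete before it is used.
    void Backward(std::size_t loss) {
        assert(loss < Value.size());
        Grad[loss] = 1.0;
        for (std::size_t i = loss + 1; i-- > 0;) BackwardFns[i](*this);
    }
};
```

Building y = Relu(w * x + b) on this tape and calling Backward(y) leaves dy/dw in Grad[w], which is the quantity an optimizer step would consume.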

When to Use

Prototyping new architectures
Research and experimentation
Models where topology changes between steps
Starting point before moving to Level 2 for production

Full Example

Level1.cpp

// Tutorial/Ml/TutorialMnistClassifier.cpp
// Fashion-MNIST — 83.2% test accuracy, 244K samples/s (RTX 5090 Laptop)

class OaMnistClassifier : public OaModule {
public:
    OaMnistClassifier() {
        Fc1_ = OaMakeSharedPtr<OaLinear>(784, 128);
        Fc2_ = OaMakeSharedPtr<OaLinear>(128, 10);
        RegisterModule("fc1", Fc1_);
        RegisterModule("fc2", Fc2_);
    }
    OaDeviceMatrix Forward(const OaDeviceMatrix& x) override {
        auto h = OaFnMatrix::Scale(x, 1.0f / 255.0f); // normalize
        h = OaFnMatrix::Relu(Fc1_->Forward(h));        // 784 -> 128, ReLU
        return Fc2_->Forward(h);                       // 128 -> 10
    }
private:
    OaSharedPtr<OaLinear> Fc1_, Fc2_;
};

int main() {
    auto rt = OaEngine::Create({.AppName = "Mnist"}).Unwrap();
    OaMnistClassifier model;
    OaAdamW opt(model.AllParameterPtrs(), 0.001f);
    OaFnGrad::SetMode(OaGradMode::Dynamic);

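    // sampler, batchX, and batchY (data loading and batch buffers) are not
    // declared in this excerpt.
    for (OaI32 step = 0; step < 2000; ++step) {
        sampler.NextBatch(batchX, batchY);                           // fetch the next mini-batch
        auto logits = model.Forward(batchX);                         // forward pass
        auto loss   = OaFnMatrix::CrossEntropyLoss(logits, batchY);  // scalar loss
        OaFnGrad::Backward(loss);                                    // propagate gradients
        opt.Step();                                                  // update weights
        opt.ZeroGrad();                                              // clear gradients for the next step
    }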
}

Available Layers

Layer	Description
`OaLinear`	Fully connected layer with optional bias
`OaEmbedding`	Token embedding lookup
`OaLayerNorm`	Layer normalisation
`OaRMSNorm`	RMS normalisation (no mean shift)
`OaSequential`	Ordered container of sub-modules
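The "no mean shift" note on `OaRMSNorm` is easiest to see side by side with layer normalisation. The sketch below is plain C++ on a single vector, not the Oa implementation. The function names and the eps default are assumptions made for this example, and the learnable gain and bias are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative helpers only; not OaLayerNorm / OaRMSNorm.

// Layer norm: subtract the mean, then divide by the standard deviation.
std::vector<float> LayerNormSketch(const std::vector<float>& x, float eps = 1e-5f) {
    assert(!x.empty());
    const float n = static_cast<float>(x.size());
    float mean = 0.0f;
    for (float v : x) mean += v;
    mean /= n;
    float var = 0.0f;
    for (float v : x) var += (v - mean) * (v - mean);
    var /= n;
    const float inv = 1.0f / std::sqrt(var + eps);
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = (x[i] - mean) * inv;
    return y;
}

// RMS norm: divide by the root mean square only; the mean is not removed.
std::vector<float> RmsNormSketch(const std::vector<float>& x, float eps = 1e-5f) {
    assert(!x.empty());
    float meanSquare = 0.0f;
    for (float v : x) meanSquare += v * v;
    meanSquare /= static_cast<float>(x.size());
    const float inv = 1.0f / std::sqrt(meanSquare + eps);
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] * inv;
    return y;
}
```

For an all-positive input, the layer-norm output is centred on zero and so contains negative entries, while the RMS-norm output keeps every sign and only rescales.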

Ops used in training

OaFnMatrix::Relu, Tanh, Softmax
OaFnMatrix::Scale — scalar multiply (e.g. ÷255 normalisation)
OaFnMatrix::CrossEntropyLoss — accepts U8 class indices directly
OaFnGrad::Backward(loss) — reverse-mode autodiff from scalar loss
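As a rough picture of what a cross-entropy loss over integer class labels computes, the sketch below works on row-major logits and uint8 class indices, so no one-hot encoding is needed. CrossEntropySketch is a hypothetical name used for illustration. Whether OaFnMatrix::CrossEntropyLoss averages over the batch or uses the same stabilisation is an assumption here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only; not the OaFnMatrix implementation.
// Mean cross-entropy over row-major logits of shape [batch x numClasses].
float CrossEntropySketch(const std::vector<float>& logits,
                         const std::vector<std::uint8_t>& labels,
                         std::size_t numClasses) {
    assert(numClasses > 0 && !labels.empty());
    assert(logits.size() == labels.size() * numClasses);
    float total = 0.0f;
    for (std::size_t i = 0; i < labels.size(); ++i) {
        assert(labels[i] < numClasses);
        const float* row = logits.data() + i * numClasses;
        // Subtract the row maximum before exponentiating to avoid overflow.
        const float maxLogit = *std::max_element(row, row + numClasses);
        float sumExp = 0.0f;
        for (std::size_t c = 0; c < numClasses; ++c) {
            sumExp += std::exp(row[c] - maxLogit);
        }
        // -log softmax(row)[label] = log(sumExp) - (row[label] - maxLogit)
        total += std::log(sumExp) - (row[labels[i]] - maxLogit);
    }
    return total / static_cast<float>(labels.size());
}
```

With ten classes and all-zero logits, every class has probability 1/10, so the loss equals ln(10). That value is a useful sanity check for the first training step of an untrained 10-class classifier.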

Optimizers

OaAdamW — AdamW with decoupled weight decay (default: 0.01)
OaSGD — stochastic gradient descent with momentum
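"Decoupled" means the weight decay is applied straight to the weights rather than being added to the gradient before the moment estimates. The sketch below shows a single AdamW step in plain C++. AdamWSketch is a hypothetical type, not OaAdamW. The learning rate (0.001) and weight decay (0.01) come from this page, while the beta and epsilon values are the commonly used defaults and are assumptions here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative only; not the OaAdamW implementation.
struct AdamWSketch {
    float Lr = 0.001f;
    float Beta1 = 0.9f;         // assumed default
    float Beta2 = 0.999f;       // assumed default
    float Eps = 1e-8f;          // assumed default
    float WeightDecay = 0.01f;
    std::vector<float> M, V;    // first and second moment estimates
    long StepCount = 0;

    void Step(std::vector<float>& w, const std::vector<float>& g) {
        assert(w.size() == g.size());
        if (M.empty()) {
            M.assign(w.size(), 0.0f);
            V.assign(w.size(), 0.0f);
        }
        assert(M.size() == w.size());
        ++StepCount;
        // Bias-correction terms for the zero-initialised moments.
        const float c1 = 1.0f - std::pow(Beta1, static_cast<float>(StepCount));
        const float c2 = 1.0f - std::pow(Beta2, static_cast<float>(StepCount));
        for (std::size_t i = 0; i < w.size(); ++i) {
            M[i] = Beta1 * M[i] + (1.0f - Beta1) * g[i];
            V[i] = Beta2 * V[i] + (1.0f - Beta2) * g[i] * g[i];
            const float mHat = M[i] / c1;
            const float vHat = V[i] / c2;
            // Decoupled decay: shrink the weight directly, independent of g.
            w[i] -= Lr * WeightDecay * w[i];
            // Adam update from the bias-corrected moments.
            w[i] -= Lr * mHat / (std::sqrt(vHat) + Eps);
        }
    }
};
```

With a zero gradient the weight still shrinks by a factor of (1 - Lr * WeightDecay) per step, which is the behaviour that separates AdamW from Adam with an L2 penalty folded into the gradient.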