armfortas

A Fortran compiler for ARM64. No borrowed frontends, no LLVM, no GCC. Every stage from lexing to machine code is ours.

Why

fortsh is a ~57,000-line Fortran 2018 shell. When we went to compile it on Apple Silicon we found out that Fortran on ARM64 is, charitably, underserved:

gfortran has at least seven confirmed critical bugs on macOS ARM64, most of them involving allocatable strings — the exact feature fortsh leans on hardest. The bugs live in GCC's backend and the GCC team's queue for ARM64 Fortran is not short.
flang-new (LLVM) works around the gfortran bugs but introduces its own, particularly around C interop and derived type layout. It also requires a separate Homebrew installation and the binary is called flang-new for reasons.
Both compilers are millions of lines of code we don't own. When something breaks in a corner of AArch64 AAPCS64 that nobody expected a Fortran program to reach, "read the source and fix it" is not a realistic option.

The solution was to write a compiler that runs on the machine we actually have, that we can fix when it breaks.

Status

Active development. The full pipeline — preprocessor through Mach-O object emission — is working. Real Fortran programs compile and run.

Pipeline: Source → Preprocessor → Lexer → Parser → AST →
          Sema → SSA IR → Optimizations → ARM64 Codegen →
          afs-as → Mach-O .o → ld → Binary

The system linker (ld) is the only component we delegate. Everything else is ours.

Build

git clone --recurse-submodules https://github.com/FortranGoingOnForty/armfortas.git
cd armfortas
cargo build --workspace          # compiler + assembler + runtime
cargo test --workspace           # full test suite
cargo clippy --workspace         # lint

Once built:

target/debug/armfortas hello.f90 -o hello    # compile and link
target/debug/armfortas -c module.f90         # compile to object
target/debug/armfortas -S hello.f90          # emit assembly
target/debug/armfortas --emit-ir hello.f90   # emit IR

What Works

Language coverage (F77 through F2018)

Free-form and fixed-form source
Intrinsic types: integer, real, double precision, logical, character
Complex arithmetic (storage and arithmetic operations; some intrinsics pending)
Derived types with component access, type extension (EXTENDS), and type-bound procedures with PASS/NOPASS
FINAL procedures
SELECT TYPE with TYPE IS and CLASS IS guards
ALLOCATABLE scalars and arrays, including allocatable character strings
POINTER and TARGET attributes
OPTIONAL arguments with PRESENT() intrinsic
Full array sections and whole-array expressions
WHERE / FORALL constructs
DO, DO WHILE, DO CONCURRENT with locality specs
SELECT CASE on integer, character, and logical
ASSOCIATE and BLOCK constructs
GOTO and labeled statements
EQUIVALENCE and COMMON blocks
NAMELIST I/O
SAVE attribute with correct static storage
VALUE attribute for pass-by-value (BIND(C))
RECURSIVE functions and subroutines
Generic procedures and interfaces
Operator overloading
Statement functions
Arithmetic IF
STOP / ERROR STOP with stop codes

C interoperability (`iso_c_binding`)

Full iso_c_binding module: kind parameters (C_INT, C_DOUBLE, C_CHAR, etc.), C_PTR, C_NULL_PTR, C_LOC, C_FUNPTR, BIND(C) procedures with correct ABI including VALUE argument dispatch.
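To illustrate what VALUE argument dispatch means at the ABI level, here is a minimal sketch of two C-ABI callees, written in Rust for convenience. The function names are hypothetical and not part of armfortas. The first corresponds to a BIND(C) interface whose dummy arguments carry VALUE (the integers themselves are passed); the second corresponds to the same interface without VALUE (addresses are passed).

```rust
use std::os::raw::c_int;

/// Callee for a BIND(C) interface whose arguments are declared with VALUE:
/// the caller passes the integers themselves.
pub extern "C" fn add_by_value(a: c_int, b: c_int) -> c_int {
    a.wrapping_add(b)
}

/// Callee for the same interface without VALUE: the caller passes the
/// addresses of the actual arguments.
///
/// # Safety
/// Both pointers must be non-null and point to initialized `c_int` values.
pub unsafe extern "C" fn add_by_reference(a: *const c_int, b: *const c_int) -> c_int {
    unsafe { (*a).wrapping_add(*b) }
}
```

A compiler that mixes up the two conventions hands an address to a callee expecting a value, or the reverse, which is why the dispatch has to follow the VALUE attribute on each dummy argument.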

I/O

PRINT and WRITE with format strings and list-directed I/O
READ from stdin and files
OPEN, CLOSE, INQUIRE, REWIND, BACKSPACE, ENDFILE, FLUSH
Unformatted (binary) I/O
Stream I/O
Non-advancing I/O
FORMAT statements
NAMELIST groups

Intrinsics

Mathematical: ABS, SQRT, EXP, LOG, LOG10, SIN, COS, TAN, ASIN, ACOS, ATAN, ATAN2, SINH, COSH, TANH, MOD, MODULO, SIGN, DIM, FLOOR, CEILING, NINT, INT, REAL, DBLE, MAX, MIN, MAXVAL, MINVAL, SUM, PRODUCT

Array: SIZE, SHAPE, LBOUND, UBOUND, ALLOCATED, ASSOCIATED, RESHAPE, TRANSPOSE, MATMUL, DOT_PRODUCT, PACK, UNPACK, SPREAD, MERGE, COUNT, ANY, ALL

Character: LEN, LEN_TRIM, TRIM, ADJUSTL, ADJUSTR, INDEX, SCAN, VERIFY, CHAR, ICHAR, ACHAR, IACHAR, REPEAT, NEW_LINE

Bit: IAND, IOR, IEOR, NOT, ISHFT, ISHFTC, IBITS, IBSET, IBCLR, BTEST, POPCNT, POPPAR, LEADZ, TRAILZ

System: SYSTEM_CLOCK, CPU_TIME, DATE_AND_TIME, RANDOM_NUMBER, RANDOM_SEED, COMMAND_ARGUMENT_COUNT, GET_COMMAND_ARGUMENT, GET_COMMAND, GET_ENVIRONMENT_VARIABLE

Inquiry: KIND, SELECTED_INT_KIND, SELECTED_REAL_KIND, HUGE, TINY, EPSILON, PRECISION, RANGE, DIGITS, RADIX, MINEXPONENT, MAXEXPONENT

Optimization passes

Level	Passes
`-O0`	None (preserve IR exactly)
`-O1`	mem2reg, constant folding, local CSE, constant propagation, DCE
`-O2`	`-O1` + strength reduction, LICM
`-O3`	Same as `-O2` (vectorization and IPO deferred — see below)
`-Ofast`	`-O3` + fast-math reassociation for float add/sub constant chains

Correctness invariant: every program that produces correct output at -O0 must produce identical output at -O1, -O2, and -O3. This is enforced by the end-to-end test suite at every level.
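The invariant names -O1 through -O3 but not -Ofast. A small sketch of why reassociating a constant chain can change results, assuming IEEE 754 double-precision arithmetic; the function names are hypothetical and this is not armfortas code.

```rust
/// Evaluates `(x + 1.0) + 2.0` in source order, as a strict pipeline must.
pub fn strict_chain(x: f64) -> f64 {
    (x + 1.0) + 2.0
}

/// Evaluates the reassociated form `x + (1.0 + 2.0)`, with the constant
/// chain folded into a single `3.0`.
pub fn reassociated_chain(x: f64) -> f64 {
    x + 3.0
}
```

For small inputs such as `x = 1.0` the two forms agree. At `x = 1e16`, where adjacent doubles are 2 apart, the intermediate rounding in the strict form produces a different result from the folded form, so bit-identical output cannot be promised once reassociation is enabled.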

Modules

iso_c_binding and iso_fortran_env are built-in and always available. Authored modules compile correctly. Multi-file module dependency resolution and .amod files are in progress (Sprint 30).

What Doesn't Work Yet

In progress or deferred:

Complex number intrinsics (REAL(), AIMAG(), CONJG(), CMPLX()): storage works, intrinsic calls don't
Stack frames larger than ~32KB: prologue/epilogue generation is currently broken above that threshold; in practice this affects procedures with very large local arrays
-O3 vectorization (NEON/SIMD): the flag is accepted and output is correct, but it runs the -O2 passes
Function inlining: not yet implemented at any level
Multi-file compilation and .amod module files: Sprint 30
Loop unrolling, GVN, SROA, dead store elimination: later optimizer sprints
ieee_arithmetic, ieee_exceptions modules
Coarray Fortran
Submodules
Vtable-based polymorphic dispatch through CLASS variables

Architecture

armfortas/
├── afs-as/          Standalone ARM64 assembler (git submodule)
│   └── src/         Instruction encoding, .s parser, Mach-O emission
├── src/
│   ├── preprocess/  Fortran-aware preprocessor (#ifdef, #include, #define)
│   ├── lexer/       Tokenization — free-form + fixed-form
│   ├── parser/      Recursive descent → AST
│   ├── ast/         AST node definitions
│   ├── sema/        Symbol tables, type system, validation
│   ├── ir/          SSA-form IR with block parameters (no phi nodes)
│   ├── opt/         Optimization passes and pass manager
│   ├── codegen/     ARM64 instruction selection, linear scan register allocation
│   ├── driver/      CLI, compilation orchestration
│   └── runtime/     libarmfortas_rt — I/O, intrinsics, memory management
├── bencch/          Compiler benchmark and test harness (git submodule)
├── test_programs/   ~110 end-to-end test programs with CHECK annotations
└── runtime/         Runtime library source

Key design decisions

No LLVM. gfortran's bugs are in GCC's backend. flang's bugs are in LLVM's frontend lowering. Using either as a backend would mean inheriting the bugs we're trying to escape. We own every pass.

SSA IR with block parameters. Instead of phi nodes, blocks carry typed parameters. Cleaner to construct, easier to verify, simpler to transform. The mem2reg pass promotes stack allocas to SSA values using iterated dominance frontiers (Cytron et al.).
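A minimal sketch of the block-parameter idea and the kind of check it makes easy. All types and names here are hypothetical and far simpler than the real IR: each jump records only the types of the values it passes, and the verifier checks them against the target block's parameter list.

```rust
/// Hypothetical value types; the real IR has its own type system.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    I32,
    F64,
}

/// A branch to `target`, passing one argument per block parameter.
#[derive(Debug, Clone)]
pub struct Jump {
    pub target: usize,
    pub args: Vec<Ty>, // types of the SSA values passed along this edge
}

/// A basic block: typed parameters take the place of phi nodes.
#[derive(Debug, Clone)]
pub struct Block {
    pub params: Vec<Ty>,
    pub jumps: Vec<Jump>,
}

/// Checks that every jump targets an existing block and passes arguments
/// matching that block's parameters in count and type.
pub fn verify(blocks: &[Block]) -> Result<(), String> {
    for (i, block) in blocks.iter().enumerate() {
        for jump in &block.jumps {
            let target = blocks
                .get(jump.target)
                .ok_or_else(|| format!("block {}: jump to missing block {}", i, jump.target))?;
            if target.params != jump.args {
                return Err(format!(
                    "block {}: jump to block {} passes {:?}, expected {:?}",
                    i, jump.target, jump.args, target.params
                ));
            }
        }
    }
    Ok(())
}

/// Entry -> loop header (one i32 parameter, the counter) -> exit.
pub fn counting_loop() -> Vec<Block> {
    vec![
        Block { params: vec![], jumps: vec![Jump { target: 1, args: vec![Ty::I32] }] },
        Block {
            params: vec![Ty::I32],
            jumps: vec![
                Jump { target: 1, args: vec![Ty::I32] }, // back edge carries the next counter
                Jump { target: 2, args: vec![] },
            ],
        },
        Block { params: vec![], jumps: vec![] },
    ]
}
```

With phi nodes, the same check has to match each phi operand to a predecessor edge; with block parameters it is a direct comparison per jump.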

Apple AAPCS64 strictly. 16-byte stack alignment always, x18 reserved, x29/x30 saved in prologue, frame pointer maintained. We've been bitten by every one of these constraints and handle them correctly.
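The alignment rule can be sketched as a frame-size computation. This is an illustration under assumptions, not the armfortas prologue logic: the function name is hypothetical, and the frame is modeled as the saved x29/x30 pair plus local storage, with no other callee-saved registers.

```rust
/// Required stack alignment in bytes.
pub const STACK_ALIGN: u64 = 16;

/// Bytes reserved for the saved frame pointer (x29) and link register (x30).
pub const FP_LR_BYTES: u64 = 16;

/// Returns the total frame size for `locals_bytes` of local storage,
/// rounded up to the next multiple of 16.
pub fn frame_size(locals_bytes: u64) -> u64 {
    let raw = locals_bytes + FP_LR_BYTES;
    (raw + STACK_ALIGN - 1) & !(STACK_ALIGN - 1)
}
```

Rounding the whole frame rather than individual slots keeps the stack pointer aligned at every call site, whatever mix of locals a procedure has.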

Array descriptors. {base_addr, elem_size, rank, flags, dims[15]}. Our ABI — stable across releases.
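A sketch of how such a descriptor might look as a C-layout struct. Only the top-level field names come from the description above; the per-dimension fields (`lower`, `extent`, `stride`), the field widths, the element-count stride unit, and the helper functions are assumptions made for illustration. Strides follow Fortran's column-major order, where the first index varies fastest.

```rust
/// Maximum array rank, matching the dims[15] field.
pub const MAX_RANK: usize = 15;

/// Hypothetical per-dimension record; strides are counted in elements.
#[repr(C)]
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Dim {
    pub lower: i64,
    pub extent: i64,
    pub stride: i64,
}

/// Sketch of the descriptor layout; field widths are assumptions.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct ArrayDesc {
    pub base_addr: *mut u8,
    pub elem_size: u64,
    pub rank: u32,
    pub flags: u32,
    pub dims: [Dim; MAX_RANK],
}

/// Builds a descriptor for a contiguous array with lower bound 1 in every
/// dimension. Returns `None` if the rank exceeds `MAX_RANK`.
pub fn describe(base_addr: *mut u8, elem_size: u64, extents: &[i64]) -> Option<ArrayDesc> {
    if extents.len() > MAX_RANK {
        return None;
    }
    let mut dims = [Dim::default(); MAX_RANK];
    let mut stride = 1i64;
    for (k, &extent) in extents.iter().enumerate() {
        dims[k] = Dim { lower: 1, extent, stride };
        stride *= extent; // column-major: each dimension spans the previous ones
    }
    Some(ArrayDesc { base_addr, elem_size, rank: extents.len() as u32, flags: 0, dims })
}

/// Byte offset of the element at `index` (one subscript per dimension).
pub fn byte_offset(desc: &ArrayDesc, index: &[i64]) -> i64 {
    let rank = desc.rank as usize;
    let elems: i64 = index
        .iter()
        .zip(&desc.dims[..rank])
        .map(|(&i, d)| (i - d.lower) * d.stride)
        .sum();
    elems * desc.elem_size as i64
}
```

For a 3x4 array of 4-byte elements, element (2,3) sits at element offset (2-1)*1 + (3-1)*3 = 7, which is byte offset 28.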

String descriptors. {data, len, capacity, flags}. Deferred-length assignment always allocates new storage before freeing old. This prevents the use-after-free that causes gfortran's ARM64 allocatable string crashes.
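The ordering matters most when the right-hand side aliases the target, as in `s = s(2:4)`. A minimal sketch of the allocate-then-free order, assuming a simplified descriptor: the type and method names are hypothetical, the `flags` field is omitted, and a `Vec<u8>` stands in for the raw `data` pointer and `capacity`.

```rust
/// Simplified stand-in for a deferred-length string descriptor.
#[derive(Debug)]
pub struct StrDesc {
    data: Vec<u8>, // owned storage; Vec tracks capacity for us
    len: usize,
}

impl StrDesc {
    pub fn new(s: &str) -> Self {
        StrDesc { data: s.as_bytes().to_vec(), len: s.len() }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.data[..self.len]
    }

    /// Models `s = s(lo:hi)` with 1-based inclusive bounds.
    /// Requires `1 <= lo <= hi <= len`; panics otherwise.
    pub fn assign_self_slice(&mut self, lo: usize, hi: usize) {
        // 1. Allocate new storage and copy while the old buffer is still live.
        let fresh = self.data[lo - 1..hi].to_vec();
        // 2. Install the new buffer in the descriptor.
        let old = std::mem::replace(&mut self.data, fresh);
        self.len = self.data.len();
        // 3. Free the old buffer last, after nothing reads from it.
        drop(old);
    }
}
```

Freeing first and copying second would read from released memory whenever the source overlaps the destination, which is the use-after-free pattern the paragraph above describes.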

Large arrays on heap. Stack threshold at 64KB. Prevents the stack corruption gfortran exhibits with arrays over ~600KB.
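The placement rule reduces to a size comparison. In this sketch the names are hypothetical, and treating an array of exactly 64KB as stack-eligible is an assumption; the source only states the threshold value.

```rust
/// Stack placement threshold in bytes (64KB).
pub const STACK_THRESHOLD_BYTES: u64 = 64 * 1024;

#[derive(Debug, PartialEq, Eq)]
pub enum Storage {
    Stack,
    Heap,
}

/// Chooses where a local array lives based on its total byte size.
/// A size that overflows u64 is sent to the heap path as well.
pub fn place_local_array(elem_size: u64, elem_count: u64) -> Storage {
    match elem_size.checked_mul(elem_count) {
        Some(bytes) if bytes <= STACK_THRESHOLD_BYTES => Storage::Stack,
        _ => Storage::Heap,
    }
}
```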

afs-as is the standalone ARM64 assembler. It knows nothing about Fortran — clean API boundary. It can be used independently to assemble ARM64 .s files.

Testing

cargo test --workspace                            # all unit + integration tests
cargo test --test run_programs                    # end-to-end at -O0
cargo test --test run_programs -- --nocapture     # verbose output
cargo run -p afs-tests -- run --suite runtime     # bencch runtime suite
cargo run -p afs-tests -- run --suite consistency # reproducibility checks

The root armfortas harness is the fast, armfortas-first runner. It compiles each .f90 file in test_programs/, runs the binary, and evaluates source-embedded assertions such as:

! CHECK: for stdout
! STDERR_CHECK: for runtime stderr
! EXIT_CODE: for exact runtime exit status
! XFAIL: for known open bugs
! ERROR_EXPECTED: for diagnostics that must be emitted
! ERROR_SPAN: for exact diagnostic location
! ASM_CHECK: / ! ASM_NOT: for assembly shape
! FILE_CHECK: / ! FILE_NOT: for sandbox file side effects
! FILE_EXISTS: / ! FILE_MISSING: for explicit sandbox presence or absence
! FILE_LINE_COUNT: for structural file-shape assertions
! FILE_RERUN_MODE: for explicit overwrite vs append intent across reruns
! FILE_SET_EXACT: for exact runtime side-effect file sets
! REPRO_CHECK: for per-test asm/object/run reproducibility
! OPT_EQ: for explicit cross-opt invariants
! PHASE_TRIANGULATE: for same-opt IR/ASM/object availability, compile-cleanliness, and compile-only reproducibility oracles
! IR_CHECK: / ! IR_NOT: for IR shape

Those source comments are the canonical leaf-assertion language for the project. The root harness is where new annotation ideas should land first.
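To make the annotation idea concrete, here is a rough sketch of how `! CHECK:` comments could be collected and compared with a program's stdout. It is not the harness implementation: the function names are hypothetical, annotations are assumed to sit on their own comment lines, and the matching rule (each pattern must appear as a substring, in order) is an assumption.

```rust
/// Collects the text following each `! CHECK:` comment in a Fortran source.
/// Other annotations such as `! STDERR_CHECK:` do not match this prefix.
pub fn check_patterns(source: &str) -> Vec<String> {
    source
        .lines()
        .filter_map(|line| line.trim_start().strip_prefix("! CHECK:"))
        .map(|rest| rest.trim().to_string())
        .collect()
}

/// Returns true when every pattern occurs in `stdout`, in order.
pub fn stdout_matches(patterns: &[String], stdout: &str) -> bool {
    let mut rest = stdout;
    for pattern in patterns {
        match rest.find(pattern.as_str()) {
            // Continue searching after this match so order is enforced.
            Some(pos) => rest = &rest[pos + pattern.len()..],
            None => return false,
        }
    }
    true
}
```

Keeping expectations next to the code they describe means a test program is a single file that carries both the input and its oracle.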

bencch is the structured matrix/reporting/differential runner around that same testing language. It is best for:

opt matrices
differential/reference runs
module graphs
capability-aware execution
reports and bundles

The two surfaces are meant to converge on syntax and expectations, not drift into separate testing dialects.

All root end-to-end tests run at every optimization level (-O0 through -Ofast). Programs with known bugs carry ! XFAIL: annotations that reference the audit finding — they count as passing until the bug is fixed, at which point CI catches the unexpected success.
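The XFAIL rule can be written as a small truth table. The names below are hypothetical; the sketch only encodes the behavior described above, where an expected failure passes and an unexpected success is reported as a failure.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail,
}

/// Combines the raw result of a test's assertions with its XFAIL marker.
pub fn verdict(assertions_held: bool, xfail: bool) -> Verdict {
    match (assertions_held, xfail) {
        (true, false) => Verdict::Pass,  // ordinary success
        (false, true) => Verdict::Pass,  // known bug, failure expected
        (true, true) => Verdict::Fail,   // unexpected success: annotation is stale
        (false, false) => Verdict::Fail, // ordinary failure
    }
}
```

Failing on unexpected success forces the XFAIL annotation to be removed in the same change that fixes the bug, so the list of known bugs stays accurate.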

Target

Architecture: ARM64 (AArch64), Apple Silicon (M1/M2/M3/M4)
OS: macOS (Mach-O, Apple AAPCS64)
Standard: F77 through F2018, building inward from F2018
Goal: Compile fortsh — a 57,000-line Fortran 2018 shell — correctly on Apple Silicon

Compiling fortsh is a milestone, not the finish line. A complete compiler handles code fortsh never exercises.

Relationship to fortsh

This compiler exists because fortsh exists. Building a non-trivial Fortran program on ARM64 and discovering that neither available compiler handles it reliably was the motivation. The goal is a compiler we can fix when it breaks, running on the hardware we actually use.


Contributors 3

mfwolffe

espadonne Matthew Forrester Wolffe