Mathematical Foundations of Memory Address Mapping: From C Pointers to General Theory

[+pointer]: A pointer is a variable that stores the memory address of another variable.

[+dereference]: The dereference operator (*) retrieves the value stored at the memory address held by a pointer.

[+reference]: The reference operator (&) returns the memory address of a variable.

[+memory-address]: A unique identifier for a location in computer memory.

[+category-theory]: A branch of mathematics that deals with abstract structures and relationships between them.

[+topology]: The study of properties of space that are preserved under continuous deformations.

[+algebraic-structure]: A set equipped with one or more operations that satisfy specific axioms.

Info

This article grew out of exploratory notes from my parallel computing study. It is a speculative learning note, not a formal semantics of C. I use C pointers as a small entry point, then sketch how memory-address ideas can be described with simple mathematical language: functions, composition, address spaces, hierarchy, and distributed mappings.

Introduction

Let’s examine a classic C programming question:

int AValue = 101;
int *BValue = &AValue;
// Why does *BValue return 101?

On the surface, this seems simple. The useful question is what kind of mapping is hiding underneath it. While we start with C pointers as the example, the same style of thinking appears again in virtual memory, cache hierarchy, distributed storage, and other memory-like systems.

The mathematical language below should be read as modelling intuition. Some equations are deliberately simplified so that the structural idea is visible before the real-world edge cases are added back.

Mathematical Memory Model

Let’s model computer memory as a mathematical function. Define:

$M: \mathbb{N} \rightarrow \mathbb{Z}$

Where:

$M(a)$ represents the value stored at memory address $a$
Each variable has a unique memory address
$\mathbb{N}$ represents the set of natural numbers (memory addresses)
$\mathbb{Z}$ represents the set of integers (possible values)

This function-based model is only a first approximation, but it gives us a useful way to talk about pointers before moving to a more abstract memory mapping framework.

Variable and Pointer Definitions

Memory Address Assignment

Define:

Let $a_A \in \mathbb{N}$ be the memory address of variable AValue
Let $a_B \in \mathbb{N}$ be the memory address of variable BValue

Reference and Dereference Operators

The reference operator $\&$ and dereference operator $*$ can be defined mathematically:

$\&x = a_x \quad \text{(returns address of variable x)}$

$*p = M(p) \quad \text{(returns value at address p)}$

These definitions capture the essence of pointer operations in mathematical terms.

Step-by-Step Mathematical Analysis

Let’s trace through our C code example using this mathematical framework.

Step 1: Initial Assignment

int AValue = 101;

This operation modifies our memory function: $M(a_A) = 101$

Step 2: Pointer Assignment

int *BValue = &AValue;

This sets the value at $a_B$ to the address of AValue: $M(a_B) = a_A$

Step 3: Dereferencing Operation

Now, when we evaluate *BValue: $*\text{BValue} = M(M(a_B))$

Substituting from Step 2: $*\text{BValue} = M(a_A)$

Substituting from Step 1: $*\text{BValue} = 101$

Function Composition Analysis

The reference and dereference operators suggest a useful mathematical relationship.

Function Definitions

Define the reference function: $R: V \rightarrow A$ where $R(x) = a_x$

Define the dereference function: $D: A \rightarrow V$ where $D(p) = M(p)$

Mapping Properties

Forward Mapping (Reference): $R: V \rightarrow A$
- Maps variable values to their memory addresses
- $R(x) = a_x$ (address of variable x)
Inverse Mapping (Dereference): $D: A \rightarrow V$
- Maps memory addresses to their stored values
- $D(p) = M(p)$ (value at address p)

Limited Inverse-Style Intuition

On a named, allocated variable in this toy model, dereferencing the address of a variable recovers the stored value:

$D(R(x)) = M(a_x) = x$

The converse direction is more fragile:

$R(D(p)) = a_{M(p)} = p$

This second equation is not a general law of C. It only makes sense in an oversimplified model where every value can be treated as a uniquely addressable variable. In real programs, two locations can store the same value, temporary values may have no stable address, and invalid pointers can make dereference undefined. I keep the notation because it gives a useful first intuition, but it should not be read as a full proof.

Bijective Relationship

For reference and dereference to behave like true inverses in a model, the model would need restrictive assumptions:

Injective (One-to-one):
- If $R(x_1) = R(x_2)$ , then $a_{x_1} = a_{x_2}$ , so $x_1 = x_2$
- If $D(p_1) = D(p_2)$ , then $M(p_1) = M(p_2)$ , so $p_1 = p_2$
Surjective (Onto):
- For every address $a \in A$ , there exists a variable $x \in V$ such that $R(x) = a$
- For every value $v \in V$ , there exists an address $p \in A$ such that $D(p) = v$

Composition Identity

Under those restrictions, the toy model suggests: $D \circ R = I_V \quad \text{(identity function on values)}$ $R \circ D = I_A \quad \text{(identity function on addresses)}$

The inverse relationship can then be expressed as: $R^{-1} = D \quad \text{and} \quad D^{-1} = R$

Therefore: $R^{-1}(R(x)) = x \quad \text{and} \quad D^{-1}(D(p)) = p$

Category-Theory Vocabulary as a Sketch

A possible memory category

If we wanted a more formal model, one possible starting point would be to describe memory systems with category-theory[+category-theory] vocabulary:

Objects: Memory spaces $(A, V, M)$ where $A$ is the address space, $V$ is the value space, and $M: A \rightarrow V$ is the memory mapping
Morphisms: Memory-preserving functions between memory spaces
Composition: Function composition of memory mappings

This is only a sketch. The hard part is deciding which structure the morphisms must preserve: raw addresses, allocation regions, permissions, page identity, temporal state, or some combination of them.

A possible memory functor sketch

After those categories are defined, a toy construction could look like a functor $\mathcal{M}: \mathbf{Set} \rightarrow \mathbf{Mem}$ where:

$\mathbf{Set}$ is the category of sets and functions
$\mathbf{Mem}$ is a chosen category of memory spaces and structure-preserving morphisms
$\mathcal{M}$ maps a set $S$ to the memory space $(S, S, \text{id}_S)$

The point is not that this is the right semantics of memory. It is a compact way to ask what kind of structure is being preserved when we move from one memory representation to another.

A cautious categorical analogy

It is tempting to describe reference and dereference with category-theory language, but that needs care. In this note, $R$ and $D$ are ordinary maps inside a toy memory model. They are not automatically functors, and the note does not prove an adjunction.

The safer analogy is simpler:

$R$ records how a named value is associated with an address.
$D$ records how an address is read back into a value.
$D \circ R$ describes the useful “take the address, then load from it” pattern under restrictive assumptions.

Category theory becomes relevant only after we define precise categories of memory states and structure-preserving maps between them. I leave that as an intuition, not as a theorem.

Topology of Memory Spaces

Memory as a Topological Space

We can endow the address space $A$ with a topology[+topology] that reflects the structure of memory:

Open Sets: Represent accessible memory regions
Closed Sets: Represent protected or reserved memory regions
Continuity: Memory operations that preserve accessibility

Metric Properties

Define a metric $d: A \times A \rightarrow \mathbb{R}$ on the address space:

$d(a_1, a_2) = |a_1 - a_2|$

This metric induces the standard topology on $\mathbb{N}$ , where:

Balls: $B_r(a) = \{x \in A : |x - a| < r\}$ represent memory neighborhoods
Continuity: For raw stored values this is usually not a realistic assumption; it is more useful as an analogy for preserving accessibility or layout relationships

Compactness and Memory Allocation

Compact Sets: Finite memory regions that can be completely allocated
Connected Components: Contiguous memory blocks
Separation Properties: Memory protection and isolation

Homeomorphisms and Memory Remapping

A memory remapping $f: A \rightarrow A'$ is a homeomorphism if:

$f$ is bijective
Both $f$ and $f^{-1}$ are continuous
$f$ preserves memory accessibility relationships

This is only a rough analogy for address remapping. Real virtual memory translation is permissioned, partial, page-based, and not usually a homeomorphism in the strict topological sense.

Algebraic Structures in Memory

Offset arithmetic, not a global address algebra

It would be too strong to say that real program addresses form an abelian group or a ring. C pointer arithmetic is typed, bounded by allocated objects, and can become undefined outside those bounds. The null pointer is also not a neutral element for ordinary address addition.

A safer model is local offset arithmetic. Within one allocated array-like region, an address can be represented as:

$a_i = a_0 + i \cdot \text{sizeof}(T)$

This local model inherits some familiar arithmetic laws, but only inside the valid region and only for the relevant element type. It is a tool for reasoning about layout, not a global algebra of all memory addresses.

Homomorphisms Between Memory Models

A memory homomorphism $\phi: M_1 \rightarrow M_2$ preserves the algebraic structure:

$\phi(a + b) = \phi(a) + \phi(b)$ $\phi(a \cdot b) = \phi(a) \cdot \phi(b)$

This can model memory abstraction layers only when the chosen maps actually preserve the structure we care about, such as offsets, permissions, or page identity.

Algebraic Proof

Given the assignment $\text{BValue} = \&\text{AValue}$ , we can prove:

$*\text{BValue} = *(\&\text{AValue}) = D(R(\text{AValue})) = \text{AValue} = 101$

This toy calculation shows why the simple pointer example behaves as expected.

Memory Mapping Visualization

The memory mapping can be represented as:

\begin{align*} a_A &\mapsto 101 \\ a_B &\mapsto a_A \end{align*}

Therefore: $*\text{BValue} = M(M(a_B)) = M(a_A) = 101$

Info

This double dereference pattern $M(M(a_B))$ is fundamental to understanding pointer behavior. The first $M$ gets the address stored in the pointer, and the second $M$ gets the value at that address.

Practical Implications

Pointer Aliasing

When two pointers point to the same memory location, we have: $M(a_{p1}) = M(a_{p2}) = a_x$

This means: $*p1 = *p2 = M(a_x)$

Pointer Arithmetic

Pointer arithmetic can be modeled as: $p + n = a_p + n \times \text{sizeof}(type)$

Where $\text{sizeof}(type)$ is the size of the pointed-to type in bytes.

Null Pointers

A null pointer can be represented as: $a_{null} = 0$

With the special property: $M(0) = \text{undefined}$

Common Pointer Patterns

Function Pointers

Function pointers extend our model: $M(a_f) = \text{function address}$

$*a_f = \text{function execution}$

Double Pointers

Double pointers (pointers to pointers) create nested mappings: $M(a_{pp}) = a_p$

$M(a_p) = a_x$

$**pp = M(M(a_{pp})) = M(a_p) = a_x$

Memory Safety Considerations

Our mathematical model helps explain common pointer errors:

Dangling Pointers

$M(a_p) = \text{invalid address}$

$*p = M(\text{invalid address}) = \text{undefined behavior}$

Buffer Overflows

When $p + n$ exceeds allocated memory bounds: $M(a_p + n) = \text{unauthorized memory access}$

General Memory Mapping Theory

Abstract Memory Mapping Framework

Let’s generalize beyond our initial simple model. A Memory Mapping System is a tuple $\mathcal{M} = (A, V, M, \mathcal{O})$ where:

$A$ : Address space (possibly structured)
$V$ : Value space (possibly structured)
$M: A \rightarrow V$ : Memory mapping function
$\mathcal{O}$ : Set of memory operations

Classification of Memory Mappings

By Injectivity

Injective: $M(a_1) = M(a_2) \Rightarrow a_1 = a_2$ (no aliasing)
Non-injective: Allows multiple addresses to map to same value

By Surjectivity

Surjective: $\forall v \in V, \exists a \in A : M(a) = v$ (full coverage)
Non-surjective: Some values cannot be stored

By Composition Properties

Idempotent: $M \circ M = M$
Involutive: $M \circ M = \text{id}$
Nilpotent: $\exists n : M^n = 0$

Memory Mapping Composition

Given two memory systems $\mathcal{M}_1 = (A_1, V_1, M_1, \mathcal{O}_1)$ and $\mathcal{M}_2 = (A_2, V_2, M_2, \mathcal{O}_2)$ , we can define:

Direct Sum

$\mathcal{M}_1 \oplus \mathcal{M}_2 = (A_1 \sqcup A_2, V_1 \sqcup V_2, M_1 \sqcup M_2, \mathcal{O}_1 \sqcup \mathcal{O}_2)$

Tensor Product

$\mathcal{M}_1 \otimes \mathcal{M}_2 = (A_1 \times A_2, V_1 \times V_2, M_1 \times M_2, \mathcal{O}_1 \times \mathcal{O}_2)$

Function Composition

If $V_1 = A_2$ , then: $\mathcal{M}_2 \circ \mathcal{M}_1 = (A_1, V_2, M_2 \circ M_1, \mathcal{O}_2 \circ \mathcal{O}_1)$

Memory Mapping Decomposition

A memory mapping $M: A \rightarrow V$ can be decomposed as:

$M = \pi_V \circ \iota_A$

Where:

$\iota_A: A \rightarrow A \times V$ is the inclusion map
$\pi_V: A \times V \rightarrow V$ is the projection map

This decomposition is mostly a reminder that a memory read can be factored through a richer paired representation. I would not treat it as a universal property unless the surrounding category and morphisms are defined explicitly.

Optional universal-property sketch

In a more formal development, one might ask whether there is a “free” memory-like construction $F(A)$ generated by an address set $A$ . Such a construction would have to satisfy a universal property of the form:

$\forall M: A \rightarrow V, \exists ! \tilde{M}: F(A) \rightarrow V \text{ such that } M = \tilde{M} \circ \eta_A$

where $\eta_A: A \rightarrow F(A)$ embeds addresses into the generated structure. This note does not build that category; it only records the shape of the question.

Memory mapping categories

If such a category Mem is defined, it could contain:

Objects: Memory mapping systems
Morphisms: Memory-preserving functions
Composition: Function composition

Only after that definition could we ask whether it has:

Products: Direct products of memory systems
Coproducts: Disjoint unions of memory systems
Exponential Objects: Function spaces between memory systems

Multi-level Memory Hierarchy

Hierarchical Memory Systems

A Memory Hierarchy is a sequence $\mathcal{H} = (\mathcal{M}_1, \mathcal{M}_2, \ldots, \mathcal{M}_n)$ where:

$\mathcal{M}_1$ : Level 1 (fastest, smallest)
$\mathcal{M}_n$ : Level n (slowest, largest)
Transfer functions $T_{i,j}: \mathcal{M}_i \rightarrow \mathcal{M}_j$ for $i < j$

Performance Metrics

Define the Access Time Function $t: \mathcal{H} \rightarrow \mathbb{R}^+$ :

$t(\mathcal{M}_i) = t_i \text{ (base access time for level i)}$

The Effective Access Time for memory access pattern $P$ :

$T_{eff}(P) = \sum_{i=1}^n h_i(P) \cdot t_i$

Where $h_i(P)$ is the hit rate at level $i$ for pattern $P$ .

Cache Memory Mathematics

For a cache with associativity $k$ , the Cache Hit Probability is:

$P_{hit} = \sum_{i=0}^{k-1} P_{locality}(i)$

Where $P_{locality}(i)$ is the probability of accessing within $i$ positions.

Optimal Replacement Strategies

The Optimal Replacement Problem can be formulated as:

$\min_{R} \sum_{t=1}^T \mathbb{I}_{miss}(t, R)$

Where $R$ is the replacement strategy and $\mathbb{I}_{miss}$ is the miss indicator.

Virtual Memory Theory

Address Translation Functions

Virtual memory systems use address translation:

$\tau: V_{virtual} \rightarrow V_{physical}$

Where $\tau$ is typically piecewise defined based on page tables.

Page Table Mathematics

A page table can be represented as a function:

$PT: \text{VPN} \rightarrow \text{PPN} \times \text{Permissions}$

Where:

VPN: Virtual Page Number
PPN: Physical Page Number
Permissions: Access rights bitmap

Translation Lookaside Buffer (TLB)

The TLB is a cache for address translations:

$TLB: \text{VPN} \rightarrow \text{PPN} \times \text{Permissions}$

With TLB Hit Rate:

$P_{TLB} = \frac{\text{Number of TLB hits}}{\text{Total memory accesses}}$

Memory Fragmentation

External Fragmentation

The External Fragmentation Ratio:

$F_{ext} = 1 - \frac{\text{Largest contiguous free block}}{\text{Total free memory}}$

Internal Fragmentation

The Internal Fragmentation Ratio:

$F_{int} = \frac{\text{Wasted space within allocated blocks}}{\text{Total allocated memory}}$

Concurrent Memory Access Models

Shared Memory Systems

For concurrent access, we extend our model to:

$M: A \times T \rightarrow V$

Where $T$ is the set of time points or thread identifiers.

Consistency Models

Sequential Consistency

$\forall \text{ operations } o_1, o_2, o_1 \text{ appears before } o_2 \text{ in some global order}$

Linearizability

Each operation appears to occur atomically at some point between its invocation and completion.

Memory Fences and Synchronization

Define Memory Orderings as relations on memory operations:

Sequential Consistent: Total order on all operations
Release-Acquire: Synchronizes operations via release/acquire pairs
Relaxed: No ordering guarantees beyond program order

Race Condition Analysis

A Data Race occurs when:

$\exists t_1 \neq t_2, a \in A : \text{write}(t_1, a) \parallel \text{write}(t_2, a)$

Where $\parallel$ denotes concurrent execution without synchronization.

Applications Across Domains

Database Memory Management

Database systems use memory mapping to implement:

Buffer Pools: $\text{BufferPool}: \text{DiskPages} \rightarrow \text{MemoryFrames}$
Index Structures: $B^+$ -trees as hierarchical memory mappings
Query Processing: Intermediate result mappings

The mathematical model for database buffer pools can be represented as:

$\text{BufferPool}: \text{DiskPages} \rightarrow \text{MemoryFrames} \times \{\text{clean}, \text{dirty}\}$

Where each memory frame has a state indicating whether it needs to be written back to disk.

Replacement strategies can be modeled as:

$R: \text{MemoryFrames} \times \text{AccessPattern} \rightarrow \text{VictimFrame}$

The optimal replacement strategy minimizes page faults:

$\min_{R} \sum_{t=1}^{T} \mathbb{I}_{\text{page fault}}(t, R)$

Distributed Computing

Distributed memory systems involve:

Partitioning: $\text{Partition}: \text{GlobalAddress} \rightarrow \text{NodeID} \times \text{LocalAddress}$
Replication: Multiple mappings for same data
Consistency: Distributed memory consistency protocols

The address mapping in distributed systems can be represented as:

$\text{GlobalMapping}: \text{GlobalAddress} \rightarrow \text{NodeID} \times \text{LocalAddress}$

Consistency models define synchronization rules between multiple replicas:

Strong consistency: All replicas are identical at all times
Eventual consistency: Replicas converge to the same state over time
Causal consistency: Causally related operations are executed in order

The mathematical model for data replication:

$\text{Replication}: \text{Data} \rightarrow \{\text{Node}_1, \text{Node}_2, \ldots, \text{Node}_n\}$

Where each node stores a copy of the data.

Conclusion

The pointer dereference $*\text{BValue}$ evaluates to 101 because BValue stores the address of AValue, and dereferencing follows that address back to the stored integer. In the toy notation, this is the useful part of the model:

$D(R(\text{AValue})) = \text{AValue}$

That identity is a teaching shorthand, not a complete account of C memory. The value of the notation is that it trains the same mental habit needed for larger systems: name the address space, name the value space, and ask what mapping connects them.

This is why I still keep the article. It is not a finished theory, but it points toward a useful bridge between low-level programming and mathematical modelling: from C pointers to virtual memory, cache behaviour, concurrent memory, and distributed address spaces.

Introduction

Mathematical Memory Model

Variable and Pointer Definitions

Memory Address Assignment

Reference and Dereference Operators

Step-by-Step Mathematical Analysis

Step 1: Initial Assignment

Step 2: Pointer Assignment

Step 3: Dereferencing Operation

Function Composition Analysis

Function Definitions

Mapping Properties

Limited Inverse-Style Intuition

Bijective Relationship

Composition Identity

Category-Theory Vocabulary as a Sketch

A possible memory category

A possible memory functor sketch

A cautious categorical analogy

Topology of Memory Spaces

Memory as a Topological Space

Metric Properties

Compactness and Memory Allocation

Homeomorphisms and Memory Remapping

Algebraic Structures in Memory

Offset arithmetic, not a global address algebra

Homomorphisms Between Memory Models

Algebraic Proof

Memory Mapping Visualization

Practical Implications

Pointer Aliasing

Pointer Arithmetic

Null Pointers

Common Pointer Patterns

Function Pointers

Double Pointers

Memory Safety Considerations

Dangling Pointers

Buffer Overflows

General Memory Mapping Theory

Abstract Memory Mapping Framework

Classification of Memory Mappings

By Injectivity

By Surjectivity

By Composition Properties

Memory Mapping Composition

Direct Sum

Tensor Product

Function Composition

Memory Mapping Decomposition

Optional universal-property sketch

Memory mapping categories

Multi-level Memory Hierarchy

Hierarchical Memory Systems

Performance Metrics

Cache Memory Mathematics

Optimal Replacement Strategies

Virtual Memory Theory

Address Translation Functions

Page Table Mathematics

Translation Lookaside Buffer (TLB)

Memory Fragmentation

External Fragmentation

Internal Fragmentation

Concurrent Memory Access Models

Shared Memory Systems

Consistency Models

Sequential Consistency

Linearizability

Memory Fences and Synchronization

Race Condition Analysis

Applications Across Domains

Database Memory Management

Distributed Computing

Conclusion

Discussion