8.2 Collections

Managing Data with Collections
Introduction
Rust’s standard library provides a set of collections designed to work with its ownership and borrowing model. These are similar to the containers in the C++ Standard Template Library (STL), such as std::vector, std::string, and std::unordered_map.
- Vec<T> is a growable array for dynamic lists.
- String is a UTF-8 compliant, growable string type.
- HashMap<K, V> provides key-value storage for fast lookups.
These collections provide familiar functionality with compile-time safety guarantees, which helps reduce memory-related bugs. This is important for systems where reliability is required, such as edge programming.
This section examines Vec<T>, String, and HashMap<K, V>, providing examples and comparisons to their C++ counterparts.
Vec<T>: Dynamic Arrays
Vec<T> (pronounced “vector”) is Rust’s equivalent of a dynamic array, similar to std::vector<T> in C++. It is a growable list type that can store a variable number of values of the same type T. Vec<T> allocates its elements on the heap, allowing it to grow or shrink at runtime, while the Vec object itself (containing a pointer to the data, its capacity, and its current length) resides on the stack.
Creating a Vec
You can create a Vec using the vec! macro, which can infer the type, or by explicitly specifying the type. An empty Vec can be created with Vec::new().
```rust
fn main() {
    // Create an empty vector of i32 type
    let mut numbers: Vec<i32> = Vec::new();
    numbers.push(10);
    numbers.push(20);
    println!("Numbers: {:?}", numbers); // Output: Numbers: [10, 20]

    // Create a vector with initial values using the vec! macro
    let colours = vec!["blue", "red", "green"];
    println!("Colours: {:?}", colours); // Output: Colours: ["blue", "red", "green"]

    // Create a mutable vector
    let mut even_numbers = vec![2, 4, 6];
    println!("Even numbers: {:?}", even_numbers); // Output: Even numbers: [2, 4, 6]
}
```
This code gives the output:
```text
Numbers: [10, 20]
Colours: ["blue", "red", "green"]
Even numbers: [2, 4, 6]
```
Adding and Removing Elements
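Elements can be added to the end of a mutable Vec using the push() method. Elements can be removed using the remove() method, which takes an index, returns the removed element, and shifts all later elements one position to the left. Like indexing, remove() panics if the index is out of bounds.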
```rust
fn main() {
    let mut items = vec!["apple", "banana", "cherry"];
    println!("Original items: {:?}", items);

    // Add an element
    items.push("date");
    println!("After push: {:?}", items);

    // Remove an element at index 1 (i.e., banana)
    items.remove(1);
    println!("After remove: {:?}", items);
}
```
This code gives the output:
```text
Original items: ["apple", "banana", "cherry"]
After push: ["apple", "banana", "cherry", "date"]
After remove: ["apple", "cherry", "date"]
```
Accessing Elements
Elements can be accessed by indexing using square brackets [] or by using the get() method, which returns an Option<&T> to handle out-of-bounds access safely. Option is covered later in this chapter, but there is a brief summary after this code example.
```rust
fn main() {
    let data = vec![1, 2, 6, 7, 8];

    // Access the first element using indexing.
    // This will copy the value because integers implement the Copy trait.
    // NOTE: This would panic if data were empty.
    let first = data[0];
    println!("First element: {}", first); // Output is: First element: 1

    // Access using get() (returns Option, safe for out-of-bounds)
    match data.get(2) {
        Some(value) => println!("Third element: {}", value), // Output: Third element: 6
        None => println!("Index out of bounds."),
    }

    // Access index 10 that is out of bounds
    match data.get(10) {
        Some(value) => println!("Tenth element: {}", value),
        None => println!("Index out of bounds."), // Output: Index out of bounds.
    }
}
```
This code gives the output:
```text
First element: 1
Third element: 6
Index out of bounds.
```
Rust’s indexing performs runtime bounds checking, similar to std::vector::at() in C++, which can introduce a minor performance overhead if not optimised away by the compiler. However, this ensures spatial safety, preventing buffer overflows that are common in C/C++ when using raw pointers without manual checks.
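As a small sketch of how the Option returned by get() can be used without a full match, the example below wraps it in a helper that falls back to a default value. The function name reading_or_default is hypothetical and only for illustration; get(), copied(), unwrap_or(), first(), and last() are standard library methods.

```rust
/// Hypothetical helper: returns the element at `index`, or `default`
/// when the index is out of bounds, instead of panicking.
fn reading_or_default(data: &[i32], index: usize, default: i32) -> i32 {
    // get() yields Option<&i32>; copied() turns it into Option<i32>.
    data.get(index).copied().unwrap_or(default)
}

fn main() {
    let data = vec![1, 2, 6, 7, 8];

    // first() and last() return Option<&T>, so an empty Vec is handled safely.
    println!("First: {:?}", data.first());
    println!("Last: {:?}", data.last());

    // In bounds: returns the stored value.
    println!("Index 2: {}", reading_or_default(&data, 2, -1));
    // Out of bounds: returns the default rather than panicking.
    println!("Index 10: {}", reading_or_default(&data, 10, -1));

    let empty: Vec<i32> = Vec::new();
    println!("First of empty: {:?}", empty.first());
}
```

Taking a &[i32] slice rather than a &Vec<i32> lets the same helper accept arrays and sub-slices as well as vectors.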
Iterating Over a Vec
A for loop is the idiomatic way to iterate over a Vec in Rust. It is built on iterators, which provide both efficiency and safety.
```rust
fn main() {
    let numbers = vec![1, 2, 6, 7, 8];

    // Iterate over immutable references
    for num in numbers.iter() {
        println!("Value: {}", num); // See output below
    }

    // Iterate over mutable references (requires mut vector)
    let mut scores = vec![10, 20, 60, 70, 80];
    for score in scores.iter_mut() {
        *score += 5; // Dereference to modify the value
    }
    println!("Updated scores: {:?}", scores); // Outputs updated scores…

    // Iterate by taking ownership (consumes the vector)
    let names = vec!["Derek", "Joe", "Jack"];
    for name in names { // names is moved here
        println!("Name: {}", name);
    }
    // println!("{:?}", names); // ERROR: use of moved value: names
    // Using `for name in &names` would solve this by only allowing
    // the loop to borrow each element. Using a clone of names would also work.
}
```
This code gives the output:
```text
Value: 1
Value: 2
Value: 6
Value: 7
Value: 8
Updated scores: [15, 25, 65, 75, 85]
Name: Derek
Name: Joe
Name: Jack
```
Vec<T> for Edge Programming
For edge programming and embedded systems, where dynamic memory allocation (heap usage) can be problematic due to fragmentation or unpredictable performance, the heap allocation for Vec<T> might be a concern. However, there are two common mitigations:
- heapless::Vec: The heapless crate provides a Vec implementation that allocates its buffer on the stack or in static memory, with a fixed, compile-time defined capacity. This avoids dynamic heap allocations entirely, making it suitable for bare-metal or real-time contexts.
- For standard Vecs, you can pre-allocate capacity using Vec::with_capacity() to minimise reallocations during runtime.
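The pre-allocation approach can be sketched as follows. The helper fill_without_realloc is a hypothetical name used only for this illustration; it relies on the standard Vec::with_capacity() and capacity() methods to check that pushing up to the reserved number of elements does not cause the buffer to grow.

```rust
/// Hypothetical helper: fills a pre-allocated buffer with `count` samples
/// and reports whether the Vec had to grow while doing so.
fn fill_without_realloc(count: usize) -> (Vec<u16>, bool) {
    // Reserve space for at least `count` elements up front.
    let mut samples: Vec<u16> = Vec::with_capacity(count);
    let initial_capacity = samples.capacity();

    for i in 0..count {
        samples.push(i as u16); // stays within the reserved capacity
    }

    // If the capacity is unchanged, no reallocation took place.
    let reallocated = samples.capacity() != initial_capacity;
    (samples, reallocated)
}

fn main() {
    let (samples, reallocated) = fill_without_realloc(64);
    println!(
        "len = {}, capacity at least 64: {}, reallocated: {}",
        samples.len(),
        samples.capacity() >= 64,
        reallocated
    );
}
```

Note that with_capacity() still performs one heap allocation; it only removes the repeated reallocations that would otherwise happen as the Vec grows.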
String: Owned, Growable Text
In Rust, the String type is a way to handle textual data. It is used when the length of the text can change, or when you need to modify its contents. As a growable type, String can expand or shrink as needed, accommodating varying amounts of text without requiring you to pre-allocate an exact size. This is used for tasks like reading user input, parsing files, or building strings dynamically.
Being mutable, String allows in-place modifications such as appending new text, inserting characters, or replacing substrings. This mutability, combined with its growable nature, makes it ideal for dynamic text manipulation.
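A minimal sketch of these in-place operations is shown below. The function build_status and the status text are made up for illustration. One point worth noting is that replace_range() edits the String in place, whereas replace() leaves the original untouched and returns a new String.

```rust
/// Builds a status line using in-place String operations.
/// (Hypothetical example; the text content is arbitrary.)
fn build_status() -> String {
    let mut s = String::from("temp");
    s.push_str("=21"); // append a &str          -> "temp=21"
    s.push('C'); //       append a single char   -> "temp=21C"
    s.insert(0, '['); //  insert at byte index 0 -> "[temp=21C"
    s.push(']'); //                              -> "[temp=21C]"
    // Replace bytes 1..5 ("temp") in place with a longer word.
    s.replace_range(1..5, "temperature");
    s
}

fn main() {
    let status = build_status();
    println!("{status}");

    // replace() does not modify `status`; it returns a new String.
    let updated = status.replace("21", "22");
    println!("{updated}");
    println!("{status}");
}
```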
A critical characteristic of String is its UTF-8 encoding. This means it’s designed to correctly handle characters from different languages and scripts worldwide, ensuring that your text is represented accurately regardless of its origin. This is a significant advantage over simpler ASCII-based string representations, which can struggle with internationalisation.
Furthermore, String owns its data, which is a core concept in Rust’s memory safety model. This means that when a String variable is in scope, it is responsible for managing the memory allocated for its text data. This data is stored on the heap. When the String goes out of scope, its memory is automatically deallocated, preventing memory leaks and ensuring efficient resource management without the need for manual memory deallocation.
The direct analogue of Rust’s String is C++’s std::string. Both types serve as the primary owned, growable, and mutable string types, handling dynamic text data with similar heap allocation and automatic memory management (through destructors in C++ and Drop in Rust). However, Rust’s String integrates with its strong ownership and borrowing system, providing compile-time guarantees against common string-related errors like use-after-free or double-free, which C++ developers must typically manage manually.
Example Creating Strings
Strings can be created from string literals using String::from(), as a new empty string with String::new(), or by formatting other values using the format! macro.
```rust
fn main() {
    // From a string literal
    let s1 = String::from("Hello, Rust!");
    println!("s1: {}", s1); // Output: s1: Hello, Rust!

    // New empty string
    let mut s2 = String::new();
    s2.push_str("World"); // Add content
    println!("s2: {}", s2); // Output: s2: World

    // Using format! macro
    let name = "Alice";
    let age = 30;
    let s3 = format!("Name: {}, Age: {}", name, age);
    println!("s3: {}", s3); // Output: s3: Name: Alice, Age: 30
}
```
This code gives the output:
```text
s1: Hello, Rust!
s2: World
s3: Name: Alice, Age: 30
```
The format! macro in Rust is used to create a String by formatting arguments into a string literal. It works similarly to println! but instead of printing to the console, it returns a new String object. Think of it as a flexible way to build dynamic strings. You provide a format string (which can contain placeholders like {}) and then supply the arguments that will be inserted into those placeholders. Rust handles the conversion of your arguments into their string representation and combines them into a single String.
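Placeholders can also carry formatting instructions. The sketch below shows a few commonly used ones: named capture of a variable, fixed decimal precision, right-aligned width, and Debug formatting. The helper format_reading and its log-line layout are hypothetical, chosen only to illustrate the syntax.

```rust
/// Hypothetical helper that formats one sensor reading as a log line.
fn format_reading(name: &str, value: f64, count: u32) -> String {
    // {name}      captures the variable `name` directly
    // {value:.2}  prints two digits after the decimal point
    // {count:>4}  right-aligns the number in a field four characters wide
    format!("{name}: {value:.2} ({count:>4} samples)")
}

fn main() {
    println!("{}", format_reading("temp", 21.456, 7));

    // {:?} uses Debug formatting, which shows quotes around strings.
    let debug = format!("{:?}", "quoted");
    println!("{debug}");
}
```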
String type versus &str (String Slice)
Rust distinguishes between String (owned, mutable, heap-allocated) and &str (immutable, borrowed string slice). String literals are &str by default. Functions are often designed to accept &str to be more general, because a reference to a String is “deref coerced” into &str when passed as an argument. Deref coercion means that when a function expects an &str and you pass it a &String (a reference to a String), the compiler automatically converts the &String into an &str. This happens because String implements the Deref trait with str as its target type, allowing it to behave like an &str in these contexts.
The key points are:
- &str is a string slice or a borrowed string. It is a reference to a contiguous sequence of UTF-8 encoded characters stored elsewhere, rather than an owner of the data.
- &str is always immutable. You cannot modify the text data through a &str reference.
- The data it points to could be in various locations: a String on the heap, part of a String, or even hardcoded into the program’s binary (like string literals).
- &str is essentially a “view” into string data, similar to how a pointer or reference allows you to access data without owning it in C/C++.
Here is an example of String and &str interaction:
```rust
fn print_text(text: &str) { // Function takes a string slice
    println!("Received: {}", text);
}

fn main() {
    let owned_string = String::from("This is an owned string.");
    let string_literal = "This is a string literal.";
    // Call the function above with both:
    print_text(&owned_string); // Pass a reference to String
    print_text(string_literal); // Pass a string literal directly

    // Creating a slice from an owned String
    let part_of_string = &owned_string[5..10]; // Slice from index 5 (inc) to 10 (exc)
    println!("Part of string: {}", part_of_string); // Output: Part of string: is an
}
```
This code gives the following output:
```text
Received: This is an owned string.
Received: This is a string literal.
Part of string: is an
```
Safe String Slicing and the UTF-8 Boundary Problem
The range syntax &s[i..j] slices at byte positions, not character positions. For strings that contain only ASCII characters this distinction does not matter, but for any string containing multi-byte characters (accented letters, CJK characters, emoji), slicing at an arbitrary byte index will panic at runtime if the index falls in the middle of a multi-byte character.
```rust
let s = String::from("héllo"); // 'é' is two bytes (0xC3 0xA9) in UTF-8, at byte positions 1 and 2
// let bad = &s[0..2];         // would panic: byte 2 is inside the 'é' character
let safe = &s[3..6];           // "llo": starts at the byte after 'é'
println!("{safe}");
```
The safe way to work with characters is to use the chars() iterator, which yields char values (Unicode scalar values) regardless of their byte width:
```rust
let s = String::from("héllo");
let third: char = s.chars().nth(2).unwrap(); // 'l' (third character)
println!("{third}");

let first_three: String = s.chars().take(3).collect();
println!("{first_three}"); // "hél"
```
```text
l
hél
```
When you need both the byte index and the character (for example, to produce a valid &str slice), use char_indices():
```rust
for (byte_pos, ch) in "héllo".char_indices() {
    println!("byte {byte_pos}: '{ch}'");
}
```
```text
byte 0: 'h'
byte 1: 'é'
byte 3: 'l'
byte 4: 'l'
byte 5: 'o'
```
Note that 'é' occupies bytes 1 and 2, so the next character starts at byte 3.
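Building on char_indices(), the following sketch returns the first n characters of a string as a borrowed slice without ever cutting through a multi-byte character. The helper name first_n_chars is hypothetical. The example also shows str::get(), which returns None instead of panicking when a range does not fall on character boundaries.

```rust
/// Hypothetical helper: returns the first `n` characters of `s` as a
/// borrowed slice, never cutting through a multi-byte character.
fn first_n_chars(s: &str, n: usize) -> &str {
    // The byte position of character number `n` is where the slice must end.
    match s.char_indices().nth(n) {
        Some((byte_pos, _)) => &s[..byte_pos],
        None => s, // fewer than n + 1 characters: the whole string qualifies
    }
}

fn main() {
    let s = "héllo";
    println!("{}", first_n_chars(s, 2)); // first two characters

    // Non-panicking alternative to &s[0..2]: byte 2 is inside 'é'.
    println!("{:?}", s.get(0..2));
    println!("{}", s.is_char_boundary(2));
}
```

Because the result borrows from the input, no new String is allocated.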
Useful Slice Methods
Both &[T] (general slices) and &str (string slices) expose a rich set of methods. The following are particularly useful in edge programming contexts.
| Method | Type | Description |
|---|---|---|
| split_at(mid) | &[T] / &str | Splits into two non-overlapping parts at index mid |
| chunks(n) | &[T] | Iterator over non-overlapping sub-slices of length n |
| windows(n) | &[T] | Iterator over overlapping sub-slices of length n |
| starts_with(prefix) | &[T] / &str | Returns true if the slice begins with prefix |
| ends_with(suffix) | &[T] / &str | Returns true if the slice ends with suffix |
| contains(&item) | &[T] | Returns true if any element equals item |
| iter() | &[T] | Yields immutable references to each element |
The windows() method is particularly valuable for signal processing on edge devices: it produces every contiguous sub-slice of a given length, making it straightforward to implement a sliding-window moving average over a sensor reading buffer without copying each window into a new collection.
```rust
fn moving_average(readings: &[f32], window: usize) -> Vec<f32> {
    readings
        .windows(window)
        .map(|w| w.iter().sum::<f32>() / window as f32)
        .collect()
}

fn main() {
    let temps = [21.1, 22.3, 21.8, 23.0, 22.5, 21.9];
    let averages = moving_average(&temps, 3);
    for avg in &averages {
        print!("{avg:.2} ");
    }
    println!();
}
```
```text
21.73 22.37 22.43 22.47
```
The chunks() method divides a slice into fixed-size blocks and is useful when processing data that arrives in fixed-size packets, such as reading a frame buffer or handling batched ADC samples.
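As a sketch of the packet-processing case, the example below uses chunks() to find the peak value in each fixed-size block of samples. The helper packet_peaks and the sample values are hypothetical and chosen only for illustration.

```rust
/// Hypothetical helper: returns the peak value of each fixed-size packet.
/// The final packet may be shorter when the buffer length is not a
/// multiple of `packet_len`.
fn packet_peaks(samples: &[u16], packet_len: usize) -> Vec<u16> {
    samples
        .chunks(packet_len) // note: chunks() panics if packet_len is 0
        // chunks() never yields an empty slice, so max() is always Some here;
        // unwrap_or(0) simply avoids an explicit unwrap().
        .map(|packet| packet.iter().copied().max().unwrap_or(0))
        .collect()
}

fn main() {
    // Seven made-up ADC samples, processed in packets of three.
    let adc: [u16; 7] = [512, 530, 498, 601, 640, 615, 700];
    let peaks = packet_peaks(&adc, 3);
    println!("{:?}", peaks); // [530, 640, 700]
}
```

Unlike windows(), the blocks produced by chunks() do not overlap, so each sample is visited exactly once.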
UTF-8 Encoding
Rust’s String and &str types are guaranteed to be valid UTF-8. This means they can correctly handle a wide range of international characters, including emojis, without additional effort. This is a significant advantage over C++’s std::string, which typically stores raw bytes and does not enforce any particular encoding.
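Two practical consequences of this guarantee can be sketched in a few lines: the byte length of a string can differ from its character count, and raw bytes must pass validation before they can become a String. The helper byte_and_char_counts is a hypothetical name; len(), chars(), and String::from_utf8() are standard library functions.

```rust
/// Hypothetical helper: returns (length in bytes, number of chars).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII text: one byte per character.
    println!("{:?}", byte_and_char_counts("hello")); // (5, 5)
    // Each of these two characters takes three bytes in UTF-8.
    println!("{:?}", byte_and_char_counts("你好")); // (6, 2)

    // The byte 0xFF can never appear in valid UTF-8, so conversion fails
    // with an error instead of producing a corrupt String.
    let raw = vec![0x68, 0x69, 0xFF];
    match String::from_utf8(raw) {
        Ok(text) => println!("valid: {text}"),
        Err(err) => println!("rejected: {err}"),
    }
}
```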
Rust Example: UTF-8 Characters
The following short example demonstrates UTF-8 handling by printing a greeting (“hello”) in English, Japanese, Mandarin and Sanskrit.
```rust
fn main() {
    // English
    let greeting = String::from("Hello 👋");
    println!("{}", greeting);

    // Japanese
    let japanese = "こんにちは"; // "Hello" in Japanese (Konnichiwa)
    println!("{}", japanese);

    // Mandarin Chinese
    let chinese = "你好"; // "Hello" in Mandarin (Nǐ hǎo)
    println!("{}", chinese);

    // Sanskrit (iterate over Unicode characters)
    println!("Sanskrit characters:");
    for c in "नमस्ते".chars() { // "Hello" in Sanskrit (Namaste)
        println!("{}", c);
    }
}
```
This gives the output:
```text
Hello 👋
こんにちは
你好
Sanskrit characters:
न
म
स
्
त
े
```
Because chars() yields Unicode scalar values rather than user-perceived characters, the two combining marks in the Sanskrit word are printed on lines of their own.
Strings for Edge Programming
While String uses heap allocation, which can be a concern for embedded systems, its UTF-8 guarantee simplifies text processing for diverse data sources. For scenarios where dynamic allocation must be avoided, &str (string slices) can be used for fixed-size text, or heapless::String (from the heapless crate) provides a fixed-capacity string that allocates on the stack or in static memory.
🎬Code Demo: String Types in Rust
A C++ comparison is useful in this demo because almost every C++ programmer has the same model (std::string and string_view) and can transfer most of their intuition.
`String` vs `&str` in Rust
HashMap<K, V>: Key-Value Storage
HashMap<K, V> is Rust’s canonical hash map implementation, from the std::collections module. It is an efficient, general-purpose collection that stores key-value pairs (K and V respectively). Using mailboxes as an analogy, it works in the following way:
- Each key (“Derek” as the mailbox owner) gets run through a hash function to produce a number representation.
- That number decides which “mailbox slot” the value goes into.
- When you look up “Derek”, Rust hashes the key again and jumps directly to the right slot.
There is no searching through the whole collection required; it jumps right to the correct place. Note: the key does not have to be a String; it can be any type (see type K below).
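The mailbox analogy can be sketched in code. This is a deliberately simplified illustration, not how HashMap is implemented internally: the function mailbox_slot and the slot count are hypothetical, and DefaultHasher is used only to show the idea of turning a key into a number and then into a slot index.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical illustration: maps a key to one of `slots` mailboxes.
/// `slots` must be greater than zero.
fn mailbox_slot<K: Hash>(key: &K, slots: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher); // step 1: run the key through a hash function
    hasher.finish() % slots // step 2: the number decides the slot
}

fn main() {
    let slots = 8;
    for name in ["Derek", "Bob", "Charlie"] {
        // Hashing the same key again always gives the same slot,
        // which is what makes direct lookup possible.
        println!("{name} -> slot {}", mailbox_slot(&name, slots));
    }
}
```

Two different keys may land in the same slot (a collision), which a real hash map must handle.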
The Core Characteristics:
- A property of HashMap is that it does not guarantee any particular order for its elements. The order in which elements are iterated or appear when printed can change between runs, or even within the same run if the map is modified. This is a consequence of its underlying hash table structure, which prioritises performance over ordering.
- For insertions, deletions, and lookups (retrieving a value by its key), HashMap provides a strong average-case time complexity of amortised O(1) (constant time). This means that, on average, these operations take a fixed amount of time regardless of the number of elements in the map.
- For a type K to be used as a key in a HashMap, it must implement the Eq and Hash traits. The Hash trait defines how to compute a hash value for the key, and the Eq trait defines how to check for equality between keys. It is necessary that if two keys are considered Eq, their Hash values must also be equal. Modifying a key in such a way that its hash or equality changes while it’s in the map is a logic error: the map’s resulting behaviour is unspecified (lookups may fail, for example), although it does not cause undefined behaviour in the memory-safety sense.
- Security against HashDoS Attacks: By default, HashMap uses a randomly seeded hashing algorithm (like SipHash). This helps prevent “HashDoS” (Denial of Service) attacks, where a malicious actor could craft inputs that cause many keys to collide in the hash map, degrading performance to its worst-case O(N) and potentially making the application unresponsive. The random seed ensures that the hash function varies between program executions, making it difficult for attackers to predict hash collisions. For performance-critical applications where trusted inputs are guaranteed, users can opt for faster hashers.
- While offering fast average-case performance, HashMap can consume more memory than ordered map implementations (like BTreeMap) due to its internal array structure and the need for a “load factor” (the ratio of elements to total capacity) to maintain performance. This often means some percentage of its allocated memory is empty to minimise collisions.
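To illustrate the key requirements listed above, the sketch below derives Eq and Hash for a user-defined key type. The SensorId struct, its fields, and the readings are hypothetical. Deriving both traits from the same fields keeps them consistent, so two equal keys always produce the same hash.

```rust
use std::collections::HashMap;

/// Hypothetical key type identifying a sensor by bus and address.
/// Deriving PartialEq, Eq and Hash together keeps equality and
/// hashing consistent with each other.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct SensorId {
    bus: u8,
    address: u8,
}

fn latest_readings() -> HashMap<SensorId, f32> {
    let mut readings = HashMap::new();
    readings.insert(SensorId { bus: 1, address: 0x48 }, 21.5);
    readings.insert(SensorId { bus: 1, address: 0x49 }, 22.0);
    // Inserting with an equal key replaces the earlier value.
    readings.insert(SensorId { bus: 1, address: 0x48 }, 21.9);
    readings
}

fn main() {
    let readings = latest_readings();
    let key = SensorId { bus: 1, address: 0x48 };
    println!("{:?} -> {:?}", key, readings.get(&key));
    println!("entries: {}", readings.len());
}
```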
HashMap<K, V> is Rust’s direct counterpart to C++’s std::unordered_map<K, V>. Both are hash table implementations designed for similar use cases, prioritising fast average-case performance over ordered iteration. While their underlying implementation details (e.g., collision resolution strategies, default hashers, resizing policies) may differ, their fundamental purpose and performance characteristics are analogous.
The following is a step-by-step set of instructions for working with a basic HashMap.
1. Creating a HashMap
To use HashMap, you first bring it into scope with use std::collections::HashMap;. You can then create a new, empty hash map using HashMap::new(), as seen in the following example:
```rust
use std::collections::HashMap;

fn main() {
    // Create a new, empty HashMap with String key and i32 value.
    // It is not declared mut here because nothing is inserted yet.
    let grades: HashMap<String, i32> = HashMap::new();
    println!("Empty HashMap: {:?}", grades);
}
```
This gives the following output:
```text
Empty HashMap: {}
```
2. Adding and Accessing Elements
Elements are added using the insert() method. Values can be retrieved using the get() method, which returns an Option<&V> (an immutable reference to the value), or checked for existence with contains_key(). For conditional updates (e.g., “insert if not present”), the entry() API is more efficient than checking and then inserting.
```rust
use std::collections::HashMap;

fn main() {
    let mut grades: HashMap<String, i32> = HashMap::new();

    // Insert key-value pairs (Name and % grade)
    grades.insert(String::from("Derek"), 99);
    grades.insert(String::from("Bob"), 70);
    grades.insert(String::from("Charlie"), 39);

    // Using the entry API to insert only if the key doesn't exist
    grades.entry(String::from("David")).or_insert(85);

    println!("Grades: {:?}", grades);

    // Access a value. A &str can be used to look up a String key,
    // so there is no need to allocate a temporary String.
    let derek_grade = grades.get("Derek");
    match derek_grade {
        Some(grade) => println!("Derek's grade: {}", grade),
        None => println!("Derek not found."),
    }

    // Check if a key exists
    if grades.contains_key("David") {
        println!("David is in the map.");
    } else {
        println!("David is not in the map.");
    }
}
```
This code gives the following output (note that HashMap order is not guaranteed, so the first line may list the entries in a different order):
```text
Grades: {"Derek": 99, "Bob": 70, "Charlie": 39, "David": 85}
Derek's grade: 99
David is in the map.
```
3. Iterating Over a HashMap
HashMaps can be iterated over their keys, values, or key-value pairs using methods like keys(), values(), and iter(). The order of iteration is not guaranteed.
```rust
use std::collections::HashMap;

fn main() {
    let mut grades: HashMap<String, i32> = HashMap::new();

    // Insert key-value pairs
    grades.insert(String::from("Derek"), 99);
    grades.insert(String::from("Bob"), 70);
    grades.insert(String::from("Charlie"), 39);

    // Iterate over key-value pairs
    for (name, grade) in &grades {
        println!("Student {} has grade {}%", name, grade);
    }

    println!("\nNames (Keys) Only...");
    for name in grades.keys() {
        print!(" {}", name);
    }

    println!("\nGrades (Values) Only...");
    for grade in grades.values() {
        print!(" {}", grade);
    }
    println!();
}
```
Gives the following output (the order may differ between runs):
```text
Student Charlie has grade 39%
Student Derek has grade 99%
Student Bob has grade 70%

Names (Keys) Only...
 Charlie Derek Bob
Grades (Values) Only...
 39 99 70
```
HashMaps for Edge Programming
For edge programming, HashMap offers fast average-case lookups, which can be beneficial for data processing. However, its dynamic memory allocation and potential for hash collisions (leading to worst-case performance) might be considerations.
- HashMaps can use more memory than ordered maps like BTreeMap due to their internal array structure.
- While HashMap is usually faster, for very small sets of data (e.g., a few dozen items or fewer), a linear search on a Vec might even outperform it. If ordering is required, BTreeMap (Rust’s ordered map, similar to std::map) is the appropriate choice.
- For strict embedded environments, Rust allows the global allocator to be replaced (using the #[global_allocator] attribute), enabling developers to use arena or pool allocators to manage HashMap’s memory more predictably, mitigating fragmentation. Per-collection allocator parameters for the standard collections are still an unstable feature at the time of writing.
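As a brief sketch of the ordered alternative mentioned above, the example below stores the same kind of grade data in a BTreeMap. The helper names sample_grades and sorted_names are hypothetical. Iteration over a BTreeMap always visits keys in sorted order, regardless of insertion order.

```rust
use std::collections::BTreeMap;

/// Hypothetical sample data, inserted in non-alphabetical order.
fn sample_grades() -> BTreeMap<String, i32> {
    let mut grades = BTreeMap::new();
    grades.insert(String::from("Derek"), 99);
    grades.insert(String::from("Bob"), 70);
    grades.insert(String::from("Charlie"), 39);
    grades
}

/// Returns the keys in iteration order, which for BTreeMap is sorted order.
fn sorted_names(grades: &BTreeMap<String, i32>) -> Vec<&str> {
    grades.keys().map(|name| name.as_str()).collect()
}

fn main() {
    let grades = sample_grades();
    for (name, grade) in &grades {
        println!("{name}: {grade}%");
    }
    println!("{:?}", sorted_names(&grades)); // ["Bob", "Charlie", "Derek"]
}
```

The trade-off is that BTreeMap lookups are O(log n) rather than average-case O(1), and keys must implement Ord instead of Hash.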
🧩Knowledge Check
Match the Rust Collection Concepts
What happens if you use square brackets (e.g., data[10]) to access an out-of-bounds index in a Vec?
Which of the following is a key difference between String and &str?
In a HashMap, why are elements not returned in a guaranteed order during iteration?
What is the purpose of the HashMap entry() API?
How can you handle dynamic data in a memory-constrained embedded environment where heap allocation is forbidden?
© 2026 Derek Molloy, Dublin City University. All rights reserved.