Chaos Engineering has proven to be a powerful approach for teaching system administration by exposing students to realistic failure scenarios under controlled conditions. In 2025, we introduced a Chaos-Education platform built on FreeBSD jails that combines fault injection, gamification, and real-world Unix administration tasks.
After multiple semesters of practical use, the system was fundamentally redesigned to address scalability limitations, architectural coupling, and extensibility concerns.
This talk presents Updated Chaos Education, an evolved architecture that separates orchestration, execution, monitoring, and presentation layers while preserving the didactic principles of the original system. We describe the motivations behind the redesign, outline the new system architecture, and discuss its impact on teaching, scalability, and security. Finally, we present our vision for opening the platform as a community-driven FreeBSD-based chaos learning framework.
Teaching Unix and operating system fundamentals effectively requires more than theoretical exercises or static lab assignments. Students must learn to operate, debug, and recover real systems under pressure. These skills are increasingly difficult to assess using traditional coursework, especially now that AI-assisted solutions are widespread.
The original Chaos-Education platform addressed these challenges by introducing Chaos Engineering principles into FreeBSD education. Students were confronted with intentionally broken systems and tasked with diagnosing and fixing faults under realistic conditions. Gamification elements, such as a public highscore leaderboard, increased motivation and engagement.
While the concept proved successful, real-world deployment in teaching environments exposed architectural weaknesses that limited further growth. Updated Chaos Education represents a response to these lessons learned.
The initial system was intentionally designed as a monolithic prototype to validate the educational concept. Over time, several challenges emerged:
Tight coupling between controller logic, scenario execution, and student interaction
Limited scalability for larger student cohorts or parallel scenarios
Security concerns arising from shared responsibilities within single components
High maintenance effort when extending scenarios or adapting the system
These issues did not undermine the educational value but imposed practical limits on further development. The updated architecture aims to remove these constraints while maintaining full compatibility with the original teaching model.
The redesign was guided by the following goals:
Strict separation of concerns between orchestration, execution, monitoring, and presentation
Improved scalability without increasing system complexity for instructors
Enhanced fault isolation between students, scenarios, and infrastructure
Extensibility for new scenarios and teaching formats
Preparation for open-source collaboration
At a high level, the updated Chaos-Education platform consists of the following components:
System Controller – central orchestration and instructor interface
Student Jails – isolated environments assigned to individual students or groups
Cluster Controller Jail – scenario orchestration and coordination
Monitor Jail – metrics collection and progress tracking
Webserver Jail – public highscore and result presentation
Scenario Database – storage and retrieval of scenario definitions
Each component runs in a dedicated FreeBSD jail, enforcing isolation and reducing blast radius in case of failure.
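The one-jail-per-component layout can be illustrated with a small generator for jail.conf(5)-style stanzas. This is only a sketch under assumptions: the jail names, the `/usr/local/jails` root, and the `chaos.example` domain are hypothetical, and the System Controller is left out because the text does not say how it is deployed. The parameters used (`host.hostname`, `path`, `exec.start`, `exec.stop`, `mount.devfs`) are standard jail.conf parameters; the platform's real configuration may differ.

```python
# Hypothetical component inventory: the names and descriptions mirror the
# component list, but the identifiers themselves are illustrative only.
COMPONENTS = {
    "cluster": "scenario orchestration and coordination",
    "monitor": "metrics collection and progress tracking",
    "webserver": "public highscore and result presentation",
    "scenariodb": "storage and retrieval of scenario definitions",
}


def render_jail_conf(name: str, purpose: str, root: str = "/usr/local/jails") -> str:
    """Render one jail.conf(5)-style stanza for a single component."""
    return (
        f"# {purpose}\n"
        f"{name} {{\n"
        f'    host.hostname = "{name}.chaos.example";\n'
        f'    path = "{root}/{name}";\n'
        f'    exec.start = "/bin/sh /etc/rc";\n'
        f'    exec.stop = "/bin/sh /etc/rc.shutdown";\n'
        f"    mount.devfs;\n"
        f"}}\n"
    )


def render_all(student_count: int) -> str:
    """Render stanzas for the service jails plus one jail per student."""
    stanzas = [render_jail_conf(name, purpose) for name, purpose in COMPONENTS.items()]
    for i in range(1, student_count + 1):
        stanzas.append(render_jail_conf(f"student{i:02d}", "student environment"))
    return "\n".join(stanzas)
```

Calling `render_all(2)` yields six stanzas, four for the service jails and two for students, so adding a student only adds a stanza and leaves the service jails untouched.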
A key improvement is the explicit separation between student and lecturer roles:
Lecturers directly operate the System Controller and initiate scenarios.
Students interact exclusively with their assigned student jail via SSH.
No student has access to orchestration, monitoring, or scoring components.
This clear boundary improves security, prevents accidental interference, and aligns the learning experience with real-world access control models.
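The role boundary described above can be written down as a small access check. The sketch below is an illustration, not the platform's actual authorization code: the `User` type, the role strings, and the component name `system-controller` are hypothetical, and it models only the entry points named in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    name: str
    role: str                    # "lecturer" or "student" (assumed role names)
    jail: Optional[str] = None   # assigned student jail, if any


def may_access(user: User, target: str) -> bool:
    """Return True if the user may log in to the given component."""
    if user.role == "lecturer":
        # Lecturers enter through the System Controller.
        return target == "system-controller"
    if user.role == "student":
        # Students reach only their own assigned jail.
        return user.jail is not None and target == user.jail
    # Unknown roles get no access.
    return False
```

Under this model, a student assigned to `student01` is denied access to `monitor`, `webserver`, and every other student's jail, because the check is an allow-list rather than a deny-list.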
In the updated design, scenario logic is no longer embedded in the System Controller. Instead, a dedicated Cluster Controller Jail handles:
requesting scenario definitions from the scenario database
distributing fault instructions to student jails
tracking scenario state and completion
Communication between student jails and the cluster controller uses HTTPS-based polling and replies. This asynchronous model reduces coupling and allows scenarios to evolve independently from the core controller.
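A minimal sketch of such a poll-and-reply exchange follows. Several things here are assumptions: that the agent inside the student jail initiates the poll, the method names, the status strings, and the instruction format. The HTTPS transport is replaced by direct in-process calls so that the example stays self-contained; each handler stands in for what an HTTPS endpoint would do.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ClusterController:
    """In-memory stand-in for the cluster controller's polling endpoints."""
    pending: dict = field(default_factory=dict)  # jail_id -> deque of instructions
    state: dict = field(default_factory=dict)    # (jail_id, scenario_id) -> status

    def assign(self, jail_id: str, scenario_id: str, fault: str) -> None:
        """Queue a fault instruction for one student jail."""
        self.pending.setdefault(jail_id, deque()).append(
            {"scenario_id": scenario_id, "fault": fault}
        )
        self.state[(jail_id, scenario_id)] = "assigned"

    def handle_poll(self, jail_id: str):
        """Answer a poll: the next queued instruction, or None if idle."""
        queue = self.pending.get(jail_id)
        if not queue:
            return None
        instruction = queue.popleft()
        self.state[(jail_id, instruction["scenario_id"])] = "delivered"
        return instruction

    def handle_reply(self, jail_id: str, scenario_id: str, status: str) -> None:
        """Record the status reported back by a student jail."""
        self.state[(jail_id, scenario_id)] = status


def agent_tick(controller: ClusterController, jail_id: str, apply_fault) -> bool:
    """Run one polling cycle of the agent inside a student jail."""
    instruction = controller.handle_poll(jail_id)
    if instruction is None:
        return False
    apply_fault(instruction["fault"])
    controller.handle_reply(jail_id, instruction["scenario_id"], "injected")
    return True
```

Because the controller only answers requests and never connects into a student jail, a jail that is slow, broken, or offline simply stops polling; the controller's state for the other jails is unaffected.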
By decoupling orchestration from execution, the system gains several advantages:
student jails can be scaled horizontally
multiple scenarios can be executed in parallel
faults injected into student systems cannot affect orchestration services
This design mirrors production-grade distributed systems and reinforces Chaos Engineering concepts implicitly through the platform itself.
Monitoring is handled by a dedicated Monitor Jail, which aggregates metrics such as:
login and activity timestamps
scenario completion status
performance indicators for scoring
The Webserver Jail hosts the public highscore list, which is strictly read-only from the student perspective. This preserves the motivational aspect of gamification while preventing manipulation.
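How collected metrics might turn into a highscore list can be sketched as follows. The scoring rule (100 base points, one point deducted per full minute, a floor of 10, nothing for unresolved scenarios) is purely an assumption for illustration, as are the `Attempt` fields; the text does not describe the actual formula. The leaderboard is returned as an immutable tuple to mirror the read-only view students get.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attempt:
    student: str
    scenario_id: str
    started_at: float             # seconds, e.g. time of fault injection
    finished_at: Optional[float]  # None while the scenario is unresolved


def score(attempt: Attempt, base: int = 100, penalty_per_minute: int = 1,
          floor: int = 10) -> int:
    """Hypothetical rule: faster recovery earns more points."""
    if attempt.finished_at is None:
        return 0
    minutes = int((attempt.finished_at - attempt.started_at) // 60)
    return max(floor, base - penalty_per_minute * minutes)


def leaderboard(attempts) -> tuple:
    """Sum scores per student and rank by points, then by name."""
    totals = {}
    for attempt in attempts:
        totals[attempt.student] = totals.get(attempt.student, 0) + score(attempt)
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return tuple(ranked)
```

Keeping `score` as a pure function of recorded timestamps means the webserver side needs no write path at all: it can render whatever the monitor has already computed.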
From an educational perspective, the updated architecture enables:
more complex, multi-layered failure scenarios
parallel exercises within a single lab session
improved reproducibility between teaching runs
reduced operational overhead for instructors
Students continue to experience realistic FreeBSD systems with full administrative access, while instructors benefit from a more robust and maintainable platform.
Several insights emerged during the redesign:
Educational prototypes must evolve into maintainable systems if they are to be used long-term.
Architectural clarity directly improves teaching reliability.
Isolation is not only a security feature but also a didactic one: it prevents shortcuts and reinforces proper operational thinking.
Chaos Engineering principles apply to teaching infrastructure just as they do to production systems.
With the architectural refactoring complete, the platform is now suitable for external contributions. We explicitly invite the FreeBSD and education communities to contribute in the following areas:
new fault and recovery scenarios
scenario templates and teaching modules
orchestration improvements
monitoring and visualization enhancements
documentation and onboarding material
Our long-term goal is to establish Chaos Education as an open, FreeBSD-native learning platform for teaching system resilience, troubleshooting, and operational thinking.
Updated Chaos Education represents a significant evolution from last year’s system. By separating responsibilities, improving scalability, and strengthening isolation, the platform has matured from a teaching prototype into a robust foundation for long-term use and collaboration.
With the core architecture stabilized, the next step is to open the system to the community and explore new teaching formats, scenarios, and integrations.
We invite educators, FreeBSD users, and Chaos Engineering practitioners to contribute and to help students learn by breaking systems safely.