StokumNET Security Roadmap: From Foundation to Enterprise-Grade
Building a secure multi-tenant platform is never "done." The architecture I described in my previous post established the foundation—zero-trust principles, defense in depth, tenant isolation. But security is a journey, not a destination.
This post documents where I'm headed next: the security capabilities I'm building into StokumNET and the tools and approaches I'm using. If you're building your own applications or evaluating security architectures, I hope this roadmap provides useful reference points.
The Philosophy
Before diving into specific technologies, let me share the principles guiding these decisions:
Shift left, but don't stop there. Security should start in the IDE and continue through production. Early detection is cheaper, but runtime protection is still essential.
Automate the boring stuff. Manual security reviews don't scale. Every check that can be automated should be automated—freeing human attention for the decisions that require judgment.
Make the secure path the easy path. If security adds friction, developers will work around it. The goal is security that enables rather than blocks.
Defense in depth, always. No single control is sufficient. Layers matter.
Observability & Threat Detection
Tamper-Proof Centralized Logging
Logs are your forensic trail. If an attacker can modify or delete logs, you lose visibility into what happened. I'm building a logging infrastructure with immutability as a core requirement.
Architecture:
Application Logs → Fluent Bit → Kafka (buffer) → OpenSearch
                                                     ↓
                                        Wazuh (SIEM/correlation)
                                                     ↓
                                         Alerting (Slack/Email)

Component choices:
| Component | Tool | Why |
|---|---|---|
| Log Shipper | Fluent Bit | Lightweight, handles structured logs well |
| Buffer | Kafka or Redis Streams | Prevents log loss during ingestion spikes |
| Storage | OpenSearch | Free, scalable, good query capabilities |
| SIEM | Wazuh | Open-source, built-in rules, active community |
Tamper-proofing approaches:
Write-once storage policies prevent modification after ingestion. Log signing at the source provides cryptographic proof of integrity. Separate log infrastructure (different credentials, network segment) limits blast radius if the application is compromised.
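One way to make log signing concrete is a hash chain: each line's HMAC covers both the line and the previous signature, so deleting or reordering any entry invalidates everything after it. The sketch below uses only Python's standard library; the hard-coded `SIGNING_KEY` is a placeholder (in practice it would come from a secrets manager on the separate log infrastructure), and this is an illustration of the idea rather than a production logger.

```python
import hashlib
import hmac
import json

# Placeholder key for illustration only — in a real deployment this lives in
# the secrets manager of the *separate* log infrastructure, not in code.
SIGNING_KEY = b"replace-with-key-from-secrets-manager"

def sign_log_line(record: dict, prev_sig: str) -> dict:
    """Attach an HMAC that chains the previous line's signature, so a
    deleted or reordered entry breaks verification of everything after it."""
    payload = json.dumps(record, sort_keys=True).encode()
    sig = hmac.new(SIGNING_KEY, prev_sig.encode() + payload, hashlib.sha256).hexdigest()
    return {**record, "sig": sig}

def verify_chain(lines: list[dict]) -> bool:
    """Recompute every signature in order; any tampering returns False."""
    prev_sig = ""
    for line in lines:
        body = {k: v for k, v in line.items() if k != "sig"}
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(SIGNING_KEY, prev_sig.encode() + payload,
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, line["sig"]):
            return False
        prev_sig = line["sig"]
    return True
```

Verification can run on the OpenSearch side against the same key; because signatures chain, an attacker who gains application access but not the signing key cannot silently rewrite history.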
Behavioral Anomaly Detection
Static rules catch known patterns. Anomaly detection catches the unknown.
I'm implementing detection for unusual access patterns (user accessing data at odd hours, from unusual locations), privilege escalation attempts, API abuse (excessive requests, parameter fuzzing), and authentication anomalies (credential stuffing patterns, impossible travel).
OpenSearch has built-in anomaly detection using machine learning. Wazuh provides correlation rules that can identify attack patterns across multiple log sources. The combination provides both signature-based and behavioral detection.
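"Impossible travel" is one of the simpler anomaly checks to reason about: if two logins from the same account imply a travel speed no flight could achieve, flag them. A minimal sketch, assuming login events carry a timestamp and coordinates (the 900 km/h threshold is my assumption, roughly airliner cruising speed):

```python
import math
from datetime import datetime

MAX_PLAUSIBLE_KMH = 900.0  # assumption: roughly airliner cruising speed

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_impossible_travel(prev_login, next_login) -> bool:
    """Each login is a (datetime, lat, lon) tuple. Flags pairs whose implied
    speed exceeds MAX_PLAUSIBLE_KMH."""
    (t1, lat1, lon1), (t2, lat2, lon2) = prev_login, next_login
    hours = abs((t2 - t1).total_seconds()) / 3600.0
    if hours == 0:
        return True  # simultaneous logins from two places
    return haversine_km(lat1, lon1, lat2, lon2) / hours > MAX_PLAUSIBLE_KMH
```

In practice this lives as a Wazuh correlation rule or an OpenSearch detector rather than application code, but the logic is the same.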
DevSecOps Pipeline
CI/CD Security Gates
Every commit triggers a pipeline. Every pipeline includes security checks. Nothing reaches production without passing.
Pipeline stages:
Commit → Lint → SAST → Unit Tests → Build → Image Scan → Deploy to Stage
                                                              ↓
                                                    Integration Tests
                                                              ↓
                                                         DAST Scan
                                                              ↓
                                                     Manual Approval
                                                              ↓
                                                    Production Deploy

Tool selection (all free/open-source):
| Stage | Tool | Purpose |
|---|---|---|
| SAST | Semgrep | Fast, customizable rules, low false positives |
| SCA | Trivy, OWASP Dependency-Check | Known vulnerabilities in dependencies |
| Secrets | Gitleaks | Prevent credential leaks |
| Container | Trivy, Grype | CVEs in base images and packages |
| IaC | Checkov, KICS | Misconfigurations in Terraform/K8s |
| DAST | OWASP ZAP | Runtime vulnerability scanning |
Why these tools? I evaluated many options. Semgrep stands out for its balance of speed, accuracy, and customizability. Trivy handles multiple scan types (containers, IaC, SCA) in one tool. OWASP ZAP remains the gold standard for free DAST.
Code Signing
Every artifact should have cryptographic proof of origin. This prevents tampering and establishes chain of custody.
Implementation:
Git commits: GPG signing required. Every commit includes a signature verifying the author. Gitea can enforce signed commits on protected branches.
Container images: Cosign from the Sigstore project. Supports both key-based and keyless (OIDC-based) signing. Images are signed after build, verified before deployment.
Binaries: For any compiled artifacts distributed outside containers, GPG signatures or minisign provide lightweight verification.
Supply Chain Security
Your application is only as secure as your dependencies. Supply chain attacks are increasingly common—compromised packages, typosquatting, dependency confusion.
Defenses:
Dependency proxying: All packages flow through a local proxy (Nexus Repository OSS or Verdaccio for npm). This provides caching, audit trail, and the ability to block known-malicious packages.
SBOM generation: Every build produces a Software Bill of Materials using Syft. SBOMs in SPDX or CycloneDX format document exactly what's in each artifact.
Policy enforcement: OPA (Open Policy Agent) policies can block deployments that include packages with critical vulnerabilities or from untrusted sources.
Provenance: Working toward SLSA framework compliance. Build provenance attestations prove where artifacts came from and how they were built.
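The policy-enforcement step above can be as small as a gate script that parses the scanner's JSON report and fails the build on critical findings. The sketch below assumes a Trivy-style report shape (`Results` → `Vulnerabilities` with `VulnerabilityID` and `Severity` fields); pin your scanner version, since the exact schema is the scanner's to change, and note that a real deployment would express this as an OPA policy rather than ad-hoc Python.

```python
import json

# Policy threshold — adjust to taste; many teams also block HIGH.
BLOCK_SEVERITIES = {"CRITICAL"}

def should_block(scan_json: str) -> tuple[bool, list[str]]:
    """Return (block?, offending vulnerability IDs) from a Trivy-style
    JSON report. The shape here is an assumption; check it against the
    output of the scanner version you actually run."""
    report = json.loads(scan_json)
    offenders = [
        v.get("VulnerabilityID", "unknown")
        for result in report.get("Results", [])
        for v in result.get("Vulnerabilities") or []
        if v.get("Severity") in BLOCK_SEVERITIES
    ]
    return (len(offenders) > 0, offenders)
```

In CI, the script exits non-zero when `should_block` returns True, which fails the pipeline stage before deployment.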
Shift-Left Security
IDE Integration
The cheapest place to fix a vulnerability is before the code is committed.
Tools in my development environment:
| Tool | Purpose | Integration |
|---|---|---|
| Semgrep | SAST | VS Code extension, real-time |
| SonarLint | Code quality + security | VS Code extension |
| Gitleaks | Secrets detection | Pre-commit hook |
Pre-commit hooks run Gitleaks and basic linting before any commit. If secrets are detected, the commit is blocked. This prevents the "oops, I committed my API key" scenario.
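Gitleaks is the real tool here, but the mechanism is easy to show: a pre-commit hook scans the staged diff against known secret patterns and blocks the commit on any match. The patterns below are a tiny, illustrative subset of what Gitleaks ships with, not a substitute for its ruleset:

```python
import re

# Illustrative subset of secret patterns — use Gitleaks' full ruleset in practice.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                              # AWS access key ID
    re.compile(r"-----BEGIN (?:RSA|EC|OPENSSH) PRIVATE KEY-----"),
    re.compile(r"(?i)(?:api[_-]?key|secret)\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
]

def scan_text(text: str) -> list[str]:
    """Return every secret-like match found in a diff or file; a pre-commit
    hook would block the commit if this list is non-empty."""
    hits: list[str] = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits
```

The hook itself is one line of shell calling this (or, better, calling `gitleaks protect --staged`) and exiting non-zero on matches.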
Infrastructure-as-Code Scanning
Misconfigurations are a leading cause of cloud breaches. Scanning IaC catches issues before they're deployed.
Tools:
Checkov and KICS scan Terraform, Kubernetes manifests, and Dockerfiles for security misconfigurations: overly permissive IAM policies, containers running as root, missing encryption settings, exposed services.
These run in the CI pipeline, but I also run them locally before committing infrastructure changes.
Data Protection
End-to-End Encryption (Optional Per-Tenant)
Some tenants need data that even I, as the platform operator, cannot read. This is a significant feature with significant trade-offs.
Architecture: DEK + KEK
Tenant Master Password
↓
[PBKDF2/Argon2]
↓
Key Encryption Key (KEK)
↓
[Encrypt/Decrypt]
↓
Data Encryption Keys (DEKs) — one per table/data category
↓
[Encrypt/Decrypt]
↓
Encrypted Data in PostgreSQL

How it works:
Each tenant who opts in provides a master password. This password derives a Key Encryption Key (KEK) using a strong KDF (Argon2 or PBKDF2 with high iterations). The KEK encrypts the tenant's Data Encryption Keys (DEKs). DEKs encrypt the actual data. The KEK is never stored—it exists only in memory during an authenticated session.
The trade-off is explicit: If a tenant loses their master password, their data is unrecoverable. No backdoors. No "forgot password" recovery. This is the point—true end-to-end encryption means the platform operator cannot access the data.
Implementation considerations:
Not all data can be encrypted this way. Fields used in queries (tenant ID, timestamps for filtering) must remain searchable. The application needs careful design to encrypt sensitive fields while maintaining functionality.
I'm implementing this as an opt-in feature. Tenants who enable it will see clear warnings about the risks. For tenants who don't need this level of protection, standard encryption at rest (database-level) provides good security without the recovery risk.
Key Rotation
Keys should rotate periodically. The DEK+KEK architecture supports this: rotating a KEK only requires re-encrypting the DEKs, not all the data. DEK rotation requires re-encrypting data, but can be done incrementally.
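The DEK+KEK flow and the rotation property can be sketched in a few functions. Loud caveat: this is an illustration using only Python's standard library, so the XOR-with-HMAC-keystream "wrap" below is a stand-in for a real AEAD key wrap (AES-GCM or AES-KW from a vetted crypto library) and must not be used as-is; the PBKDF2 iteration count is likewise just a plausible figure.

```python
import hashlib
import hmac
import os

# SKETCH ONLY — the XOR wrap stands in for AES-GCM/AES-KW; use a vetted
# crypto library in production.

def derive_kek(master_password: str, salt: bytes) -> bytes:
    """Derive the Key Encryption Key from the tenant's master password.
    600k iterations is an assumed figure; Argon2id is preferred if available."""
    return hashlib.pbkdf2_hmac("sha256", master_password.encode(), salt, 600_000)

def wrap_dek(dek: bytes, kek: bytes) -> bytes:
    """'Encrypt' a 32-byte DEK under the KEK (stand-in for an AEAD wrap)."""
    nonce = os.urandom(16)
    keystream = hmac.new(kek, nonce, hashlib.sha256).digest()
    return nonce + bytes(a ^ b for a, b in zip(dek, keystream))

def unwrap_dek(wrapped: bytes, kek: bytes) -> bytes:
    nonce, ct = wrapped[:16], wrapped[16:]
    keystream = hmac.new(kek, nonce, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(ct, keystream))

def rotate_kek(wrapped_deks: list[bytes], old_kek: bytes, new_kek: bytes) -> list[bytes]:
    """KEK rotation: re-wrap every DEK; the encrypted data is untouched."""
    return [wrap_dek(unwrap_dek(w, old_kek), new_kek) for w in wrapped_deks]
```

`rotate_kek` is the payoff of the layered design: a password change or scheduled KEK rotation touches a handful of wrapped keys, not terabytes of ciphertext.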
Application Hardening
Security Headers
Browser-level defenses add another layer of protection. I've already implemented the cookie attributes; these headers are next:
| Mechanism | Purpose | Status |
|---|---|---|
| SameSite, HttpOnly, Secure | Cookie protection | ✅ Implemented |
| Content-Security-Policy | XSS mitigation | 🔄 In Progress |
| Subresource Integrity | CDN tampering protection | 📋 Planned |
| Referrer-Policy | Information leakage prevention | 📋 Planned |
| Permissions-Policy | Feature restriction | 📋 Planned |
| X-Content-Type-Options | MIME sniffing prevention | 📋 Planned |
| X-Frame-Options | Clickjacking prevention | 📋 Planned |
CSP is the most impactful but also the most complex. A strict CSP that breaks functionality is worse than no CSP. I'm implementing incrementally: report-only mode first, then enforcement after validating no legitimate functionality breaks.
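The report-only-first rollout can be captured in a small middleware-style helper: emit `Content-Security-Policy-Report-Only` until the violation reports come back clean, then flip one flag to enforce. The helper and the specific policy string below are hypothetical, not StokumNET's actual policy (and note `report-uri` is deprecated in favour of `report-to`, though still widely supported):

```python
def security_headers(enforce_csp: bool = False) -> dict[str, str]:
    """Response headers for every page. Start with enforce_csp=False
    (report-only) and flip to True once violation reports are clean.
    The policy string here is illustrative, not a recommendation."""
    csp = ("default-src 'self'; object-src 'none'; base-uri 'self'; "
           "report-uri /csp-reports")
    csp_header = ("Content-Security-Policy" if enforce_csp
                  else "Content-Security-Policy-Report-Only")
    return {
        csp_header: csp,
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "DENY",
        "Referrer-Policy": "strict-origin-when-cross-origin",
        "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    }
```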
API Security
Beyond the JWT authentication already in place:
Rate limiting: Per-user and per-IP limits prevent abuse. Implemented at the ingress controller level.
Request validation: Every API endpoint validates input against a schema. Invalid requests are rejected before reaching business logic.
Output encoding: All responses properly encode data to prevent injection attacks in consuming clients.
AI Integration
AI-Assisted Security
I'm exploring how LLMs can augment (not replace) security practices:
Threat modeling: Given architecture documentation, an LLM can generate initial threat models—identifying assets, threat actors, attack vectors. Human review refines these, but AI provides a starting point.
Code review: LLMs can review pull requests for security issues. Not as a gate, but as an additional reviewer that never gets tired and checks every line.
Vulnerability triage: When scanners produce hundreds of findings, AI can help prioritize: which are false positives, which are exploitable, which should be fixed first.
Documentation: Generating security documentation, runbooks, and incident response procedures from existing materials.
Business Intelligence
Beyond security, AI can help tenants:
Anomaly detection in transactions: Identifying unusual patterns that might indicate errors or fraud.
Demand forecasting: Predicting inventory needs based on historical patterns.
Natural language queries: "Which products sold more this quarter than last year?" without writing SQL.
These features require careful privacy consideration. Tenant data used for AI must be isolated; models trained on one tenant's data cannot leak to another.
Stress Testing & Resilience
Load Testing
Tool: k6
k6 is modern, scriptable in JavaScript, and produces clear reports. I'm building test scenarios for normal load (expected daily traffic), peak load (10x normal), stress testing (finding the breaking point), and soak testing (sustained load over time).
What I'm measuring: Response times (p50, p95, p99), error rates, resource utilization, and degradation patterns.
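k6 reports these percentiles in its summary output; the arithmetic behind them is worth having on hand anyway, for example when post-processing raw latency samples from application logs. A minimal sketch using the standard library:

```python
import statistics

def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """p50/p95/p99 from raw response-time samples (milliseconds).
    quantiles(n=100) yields 99 cut points; index k-1 is the k-th percentile."""
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

The tail percentiles (p95, p99) matter most: averages hide the slow requests that users actually notice, and degradation under load shows up in the tail first.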
Chaos Engineering
Once the system handles expected load, I'll introduce controlled failures:
Network partitions: What happens when the database is temporarily unreachable?
Resource exhaustion: How does the system behave when memory or CPU is constrained?
Dependency failures: What if the secrets manager is down?
Tools like Litmus or simple scripts can inject these failures in staging. The goal is verifying graceful degradation—the system should fail safely, not catastrophically.
Implementation Timeline
This is a roadmap, not a promise. Real-world implementation takes time.
Phase 1 (Current): CI/CD pipeline with basic security gates (Semgrep, Trivy, Gitleaks)
Phase 2: Centralized logging with Wazuh, security headers hardening
Phase 3: Code signing, supply chain security improvements
Phase 4: Optional E2E encryption for tenants
Phase 5: AI integration, advanced anomaly detection
I'll write detailed implementation posts as I complete each phase.
Why Document This Publicly?
Three reasons:
Accountability. Writing down the plan makes it real. Public commitment creates motivation to follow through.
Community. Others building secure applications might find this useful. I've learned from countless blog posts; this is paying it forward.
Transparency. If you're evaluating StokumNET for your business, you deserve to know how security is being approached. Not just marketing claims—the actual architecture and roadmap.
Conclusion
Security is a process, not a product. The foundation I've built—zero-trust architecture, defense in depth, tenant isolation—provides a solid base. The roadmap in this post describes how I'm building on that foundation.
Every feature here serves the same goal: making StokumNET trustworthy enough that businesses can rely on it for their operations. Trust is earned through consistent, thoughtful security practices—not through promises.
Follow along as I implement these features. I'll share the successes, the failures, and the lessons learned along the way.
Questions about anything in this roadmap? Suggestions for tools or approaches I should consider? Reach out—I'd love to hear from others working on similar challenges.