Secure SDLC for Embedded Projects


A vulnerability found in production firmware costs ten to a hundred times more to fix than the same vulnerability found during design or code review. For embedded devices already in the field, a firmware update may require physical technician access, OTA (Over-the-Air) update infrastructure, regulatory resubmission or a product recall. The secure SDLC (Software Development Lifecycle) for embedded systems is the discipline that catches vulnerabilities before they ship, by integrating security at every stage from requirements through deployment. This article covers the complete process: writing security requirements that are verifiable rather than aspirational, applying STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege) threat modeling to embedded system architecture, running security-focused code reviews and selecting the right test methods, hardening deployment with a structured provisioning process and, finally, maintaining the documentation that keeps a shipped product defensible across its full service life.

Why Security Planning Must Start at Day One

The cost of fixing a security defect scales exponentially with how late in the development lifecycle it is found. A NIST (National Institute of Standards and Technology) study on software defect correction costs found that a defect fixed during design costs roughly 1x. The same defect found in code review costs 10x. Found during integration testing: 25x. Found post-release: 30x to 100x. For embedded devices, post-release costs are amplified further by: OTA infrastructure requirements, regulatory re-testing obligations (FCC, CE, FDA for medical devices), potential device recalls and reputational damage from disclosed CVEs (Common Vulnerabilities and Exposures).

Beyond cost, late security additions produce worse outcomes than designed-in security. A communication layer that was not designed with authentication in mind is difficult to retrofit with mutual TLS without disrupting the protocol state machine. A firmware update path that was not designed with signature verification cannot simply have verification added without changing the binary format that production devices already expect. Security that is designed in from the start is architecturally coherent. Security bolted on at the end is a patchwork of workarounds.

Secure SDLC for embedded systems formalises this principle into a repeatable process. The four phases where security work happens are: requirements (define what security the system must provide), design and threat modeling (verify the architecture addresses all identified threats), development and review (verify the code implements the design correctly), and deployment (verify the deployed device matches the intended security configuration).

Writing Verifiable Security Requirements

A security requirement that cannot be tested is not a requirement: it is a wish. “The device shall be secure” tells you nothing about what to build or whether you have built it. Every security requirement must be specific enough to have a binary pass/fail acceptance criterion that can be verified by a test procedure.

The contrast between a weak and a strong security requirement for the same protection goal:

Weak: “The device shall use encryption for network communication.”
Strong: “All network communication between the device and cloud backend shall use TLS 1.2 or 1.3 with mandatory server certificate verification. Connections with invalid or self-signed certificates shall be rejected.”
Acceptance criterion: Wireshark capture of device traffic shows no plaintext connections. Connecting to a server with a self-signed cert causes the device to refuse the connection and log an error.

Weak: “The firmware update process shall be secure.”
Strong: “The device shall reject any firmware image whose ECDSA P-256 signature does not verify against the provisioned root public key. Rejection shall leave the running firmware unchanged and log a security event.”
Acceptance criterion: Delivering a firmware image with a modified byte causes rejection. Delivering an image signed with a different key causes rejection. The existing firmware continues to run after rejection.

Weak: “Default passwords shall not be used.”
Strong: “The device shall ship with a unique per-device password derived from a CSPRNG (Cryptographically Secure Pseudo-Random Number Generator) at factory provisioning. The password shall not be present in the firmware binary.”
Acceptance criterion: Two production devices have different passwords. Strings extraction of the firmware binary does not reveal the password. Password entropy is at least 128 bits.

Weak: “The device shall protect sensitive data.”
Strong: “Encryption keys and API credentials shall be stored in the device’s encrypted NVS partition or secure element. They shall not appear in plaintext in any flash region readable at RDP Level 1.”
Acceptance criterion: Reading flash contents at RDP Level 1 reveals no plaintext credentials or key material. Strings extraction of a flash dump does not produce API credentials.
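The CSPRNG password requirement above can be made concrete. The sketch below assumes the platform exposes a TRNG driver (named here hypothetically as hal_trng_read, referenced only in a comment); it maps raw random bytes onto a 62-symbol charset with rejection sampling so no symbol is more likely than another. At roughly 5.95 bits per symbol, 24 symbols comfortably exceed the 128-bit entropy criterion. The mapping is written as a pure function over the random bytes so it can be unit-tested deterministically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical HAL call; replace with your platform's TRNG driver:
   extern int hal_trng_read(uint8_t *buf, size_t len); */

#define PASSWORD_LEN 24  /* 24 chars x ~5.95 bits/char, roughly 142 bits */

static const char kCharset[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

/* Map TRNG output to password characters using rejection sampling so
   every charset symbol is equally likely (a plain modulo would bias
   the first few symbols). Returns the number of characters produced;
   the caller retries with fresh TRNG bytes until PASSWORD_LEN is met. */
size_t derive_password(const uint8_t *rnd, size_t rnd_len,
                       char *out, size_t out_len) {
    size_t produced = 0;
    for (size_t i = 0; i < rnd_len && produced + 1 < out_len; i++) {
        if (rnd[i] >= 248) continue;   /* reject: 248 = 4 * 62, avoids bias */
        out[produced++] = kCharset[rnd[i] % 62];
    }
    out[produced] = '\0';
    return produced;
}
```

Because some bytes are rejected, the provisioning tool should over-provision the TRNG buffer and loop until the full length is reached.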

The Requirements Gathering Process

Security requirements for an embedded device come from three sources that must be addressed together:

Asset identification: List everything the device must protect: the firmware itself, cryptographic keys, user credentials, personally identifiable data (PII) the device collects or transmits, the device’s availability (it must perform its function), and the integrity of its outputs (sensor readings, actuator commands). Each asset becomes a protection goal that drives specific requirements.

Threat identification: For each asset, consider who might want to compromise it and how. A firmware signing key is an asset: a threat to it is an attacker compromising the build server to exfiltrate the key. A temperature reading is an asset: a threat is a spoofed sensor injecting false data to manipulate an HVAC system. This is done systematically through threat modeling (see below).

Regulatory and compliance obligations: Depending on your market and product category, specific security requirements may be mandated by regulation. The EU Cyber Resilience Act (entered into force in 2024, with its main obligations applying from 2027) requires IoT devices to have unique factory credentials, a vulnerability disclosure policy, security updates for the expected product lifetime, and no known actively exploited vulnerabilities at the time of market placement. The UK PSTI (Product Security and Telecommunications Infrastructure) Act 2022, whose security regime took effect in April 2024, has broadly similar requirements including a ban on default passwords. The FDA’s 2023 cybersecurity guidance for medical devices requires a bill of materials for third-party software and a plan for monitoring and patching vulnerabilities. Map your requirements to these obligations explicitly so compliance can be demonstrated through audit.

The Six Categories of Embedded Security Requirements

Six categories cover the full scope of security requirements for a connected embedded device. Every device needs at least one requirement in each category. More complex or higher-risk devices will have multiple requirements per category.

Authentication

How do entities (devices, users, backend services) prove their identity? Requirements specify the authentication method (certificate-based, PSK, username/password), the strength of the credential (key length, password entropy), the storage of credentials (secure element, encrypted flash, no hardcoded values) and what happens on authentication failure (connection refused, rate limiting, lockout after N failures).
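The “lockout after N failures” clause becomes testable when the lockout logic lives in a small, hardware-free module. A minimal sketch follows; the constants are illustrative, and the real N and lockout window come from the written requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_FAILURES 5     /* illustrative: lockout after N consecutive failures */
#define LOCKOUT_SECS 300   /* illustrative lockout window */

typedef struct {
    uint32_t failures;       /* consecutive failures since last success */
    uint32_t locked_until;   /* epoch seconds; 0 means not locked */
} AuthState;

/* Fail-secure gate: access is denied for the whole lockout window,
   and the counter only resets on an explicit successful auth. */
bool auth_allowed(const AuthState *s, uint32_t now) {
    return now >= s->locked_until;
}

void auth_record(AuthState *s, bool success, uint32_t now) {
    if (success) { s->failures = 0; return; }
    if (++s->failures >= MAX_FAILURES) {
        s->locked_until = now + LOCKOUT_SECS;
        s->failures = 0;
    }
}
```

The same module is the natural place to hang the requirement’s other clauses: log each failure as a security event and apply rate limiting before the counter is even consulted.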

Authorisation

What is each authenticated entity permitted to do? Requirements specify which commands or data endpoints are accessible to which roles, whether topic or resource access is restricted per device, whether rate limits apply to API calls, and whether certain operations require elevated privilege (a firmware update requires a different credential than a configuration read).
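A common embedded implementation of per-role authorisation is a permission bitmask checked before any command is dispatched. The sketch below uses hypothetical permission and role names; the real set comes from your command table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative permission bits */
#define PERM_READ_CONFIG   (1u << 0)
#define PERM_WRITE_CONFIG  (1u << 1)
#define PERM_FIRMWARE_UPD  (1u << 2)

typedef struct { const char *name; uint32_t perms; } Role;

static const Role kRoles[] = {
    { "viewer",     PERM_READ_CONFIG },
    { "operator",   PERM_READ_CONFIG | PERM_WRITE_CONFIG },
    { "maintainer", PERM_READ_CONFIG | PERM_WRITE_CONFIG | PERM_FIRMWARE_UPD },
};

/* Default-deny: every required bit must be present, so an unknown
   permission or a lesser role grants nothing. */
bool role_permits(const Role *r, uint32_t required) {
    return (r->perms & required) == required;
}
```

Adding a new privileged operation then means adding a bit, not a new code path, which keeps the authorisation requirement reviewable in one place.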

Data Protection

What data must be encrypted at rest and in transit, with what algorithms, and what key sizes? Requirements specify encryption requirements per data category, the key storage mechanism, the cipher mode, and how data is classified (PII, credentials, operational data, telemetry).

Audit Logging

What security-relevant events must be recorded, where, in what format, and for how long? Requirements specify the event types (authentication attempts, command execution, firmware updates, tamper events, errors), the log storage mechanism (local flash ring buffer, forwarded to cloud SIEM (Security Information and Event Management)), the minimum retention period, and whether logs are tamper-evident (HMAC-authenticated).
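The tamper-evidence property usually comes from chaining: each record’s MAC covers the previous record’s MAC, so editing or deleting any earlier record breaks verification of everything after it. The sketch below uses a toy 32-bit mixing function purely as a stand-in for HMAC-SHA-256 (in production, use your crypto library’s HMAC, keyed with a device-unique log key); the chaining logic is what the example demonstrates.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for HMAC-SHA-256. NOT cryptographic; it exists only
   to make the chaining testable in a few lines. */
static uint32_t toy_mac(uint32_t prev_tag, const char *msg) {
    uint32_t h = prev_tag ^ 0x9E3779B9u;
    for (const char *p = msg; *p; p++) h = (h * 31u) ^ (uint8_t)*p;
    return h;
}

typedef struct { char msg[48]; uint32_t tag; } LogRecord;

/* Each record's tag covers the previous tag, forming a chain. */
void log_append(LogRecord *rec, uint32_t prev_tag, const char *msg) {
    strncpy(rec->msg, msg, sizeof rec->msg - 1);
    rec->msg[sizeof rec->msg - 1] = '\0';
    rec->tag = toy_mac(prev_tag, rec->msg);
}

int log_verify(const LogRecord *log, size_t n) {
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        if (log[i].tag != toy_mac(prev, log[i].msg)) return 0;
        prev = log[i].tag;
    }
    return 1;
}
```

Note that chaining alone does not detect truncation from the tail; pair it with a periodically reported head tag or a monotonic record counter.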

Update Mechanism

How is firmware updated securely? Requirements specify the signature verification algorithm, the key management process, whether anti-rollback is enforced, how partial update failures are handled, and whether updates are authenticated at the transport layer as well as the image layer.
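The verification order matters more than any individual check: nothing in the image, including the version field used for the rollback check, can be trusted until the signature verifies. A sketch of the decision logic, with the actual ECDSA P-256 verification stubbed out as a boolean:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t version;
    bool     signature_ok;   /* result of the real signature verify, stubbed here */
} ImageInfo;

typedef enum { UPDATE_APPLIED, UPDATE_REJECTED } UpdateResult;

/* Signature first, then anti-rollback; a rejected image must leave
   the running firmware untouched (the caller never writes flash
   before this returns UPDATE_APPLIED). */
UpdateResult ota_decide(const ImageInfo *img, uint32_t anti_rollback_counter) {
    if (!img->signature_ok) return UPDATE_REJECTED;                     /* unsigned/tampered */
    if (img->version <= anti_rollback_counter) return UPDATE_REJECTED;  /* rollback attempt */
    return UPDATE_APPLIED;
}
```

In a real bootloader the boolean would be replaced by a call into the crypto library, and each rejection branch would also emit the security event the requirement demands.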

Incident Response

What does the device do when a security event occurs? Requirements specify the tamper response levels (alert, lockdown, key zeroization), the device behaviour after detecting compromised firmware, the process for remotely quarantining a compromised device, and the vulnerability disclosure contact and response SLA (Service Level Agreement) published to the security research community.

Threat Modeling for Embedded Systems

Threat modeling is the systematic process of identifying what can go wrong with a design before building it. It produces two outputs: a prioritised list of threats that must be mitigated, and an explicit record that specific threat scenarios were considered and addressed (or consciously accepted) in the design. This record is valuable for compliance audits, security reviews and future design decisions.

The threat modeling process for an embedded system has four steps: draw the system, identify assets, enumerate threats using a framework, and prioritise by risk. The entire process for a typical IoT device takes two to four hours with a team of three to five people (firmware engineer, hardware engineer, cloud/backend engineer, product manager). The time investment pays for itself if it catches a single design-level security issue that would otherwise have required a field firmware update.

Step 1: Draw the System

Create a data flow diagram (DFD) that shows all components, all communication paths and all trust boundaries. A trust boundary is a line across which data passes from a higher-trust to a lower-trust context, or vice versa. Common trust boundaries in an embedded IoT system:

  • The boundary between the internet and the device’s network interface
  • The boundary between the application layer and the bootloader
  • The boundary between the main application and the secure element
  • The boundary between the cloud backend and the device
  • The boundary between user input (physical buttons, web interface) and firmware logic
  • The boundary between the device and any peripheral sensors or actuators connected over I2C, SPI or UART

Every communication path that crosses a trust boundary is a potential attack vector and generates threat enumeration work.

Step 2: Identify Assets

For each component in the DFD, list the assets it holds or processes. An asset is anything whose compromise would cause harm: confidentiality harm (exposure of PII or credentials), integrity harm (incorrect sensor readings causing a safety incident) or availability harm (device unable to perform its function). Every identified asset must have at least one requirement in the security requirements document that protects it.

Applying STRIDE to an Embedded Device

STRIDE is the most widely used threat enumeration framework for embedded and IoT systems. Each letter represents a threat category. Applying STRIDE systematically to each component and each trust boundary crossing in your DFD ensures coverage across all major threat types.

  • Spoofing: impersonating a legitimate entity. Embedded example: an attacker sends MQTT messages claiming to be a legitimate device by using a stolen device ID. Standard mitigations: mutual TLS with per-device certificates, challenge-response authentication.
  • Tampering: modifying data, code or configuration without authorisation. Embedded example: an attacker modifies the firmware binary on the OTA server before the device downloads it. Standard mitigations: firmware signing, HMAC-authenticated configuration, flash write protection.
  • Repudiation: denying that an action was performed. Embedded example: an attacker issues a malicious command and later claims the device acted autonomously. Standard mitigations: tamper-evident audit log, signed command receipts, server-side logging of all device commands.
  • Information Disclosure: exposing data to unauthorised parties. Embedded example: a firmware binary extracted from flash contains hardcoded API credentials. Standard mitigations: secure element key storage, flash read protection, no hardcoded credentials.
  • Denial of Service: making the device unavailable. Embedded example: an attacker floods the device’s MQTT connection with malformed packets, exhausting the packet buffer and halting operation. Standard mitigations: rate limiting, watchdog timer, input size validation, connection limits.
  • Elevation of Privilege: gaining access beyond what is authorised. Embedded example: exploiting a buffer overflow in the UART command handler to execute arbitrary code with bootloader-level access. Standard mitigations: input validation, MPU configuration, stack canaries, secure boot.

Work through each STRIDE category for each trust boundary crossing in your DFD. A typical embedded IoT device with five to seven trust boundary crossings and six STRIDE categories generates thirty to forty-two threat scenarios to evaluate. Not all of them will need non-trivial mitigations, but the process ensures none are missed by oversight.

Worked STRIDE Example: OTA Update Path

Take the OTA (Over-the-Air) firmware update path as an example trust boundary: the device downloads a firmware image from a cloud server and installs it. Applying STRIDE:

  • Spoofing: Can the update server be impersonated? Mitigation: certificate pinning to the update server’s public key, so a MITM server is rejected even if it has a valid CA-signed certificate.
  • Tampering: Can the firmware image be modified in transit or on the server? Mitigation: firmware image signed at build time with an offline key; device verifies signature before writing a single byte to flash.
  • Repudiation: Can the device later claim it never received an update? Mitigation: device logs firmware update events (version number, timestamp, outcome) to the tamper-evident audit log and acknowledges receipt to the server.
  • Information Disclosure: Does the update process expose any sensitive data? The download channel is TLS; the firmware image itself should not contain credentials (mitigated by secure provisioning practices).
  • Denial of Service: Can the update process be abused to exhaust device resources? Mitigation: image size validation before download begins, download timeout, watchdog timer during flash write operations.
  • Elevation of Privilege: Can a malicious firmware image escalate to bootloader context? Mitigation: anti-rollback counter prevents installing older firmware; signature verification rejects unsigned images; bootloader flash region write-protected.

Attack Trees

Attack trees complement STRIDE by modelling the full set of paths an attacker might take to achieve a specific goal, rather than cataloguing threat categories per component. An attack tree has a root node representing the attacker’s objective (for example, “execute arbitrary code on the device”) and child nodes representing the sub-goals or conditions required to achieve it. AND nodes require all child conditions to be true. OR nodes require any one child condition.

A simplified attack tree for “execute arbitrary code on a deployed embedded device” might look like this:

GOAL: Execute arbitrary code on device
│
├── OR: Exploit network-accessible vulnerability
│   ├── Buffer overflow in MQTT packet parser (unauthenticated)
│   ├── Command injection in HTTP management interface
│   └── Exploit deserialization bug in JSON config parser
│
├── OR: Physical access exploitation
│   ├── AND: Access JTAG/SWD interface
│   │   ├── Physical access to PCB
│   │   └── JTAG not locked (RDP Level 0)
│   ├── Desolder flash and read/modify with programmer
│   └── AND: Voltage glitch to bypass secure boot check
│       ├── Physical access to power supply
│       └── Timing window in bootloader exploitable
│
└── OR: Compromise supply chain
    ├── Replace firmware on OTA server
    └── Compromise build pipeline and inject malicious code

The attack tree reveals which mitigations have the most impact. Locking JTAG and enabling secure boot closes the entire physical JTAG branch and the voltage glitch branch (assuming the glitch mitigation is implemented correctly in the bootloader). Input validation on the MQTT parser closes the network exploit branch for that vector. Each mitigation pruned from the tree increases the attacker’s cost and reduces the probability of successful exploitation.
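The AND/OR structure is mechanical enough to encode directly, which makes “which mitigation prunes which branch” checkable rather than rhetorical. A sketch of attack-tree evaluation as a small recursive data structure:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { NODE_LEAF, NODE_AND, NODE_OR } NodeKind;

typedef struct AttackNode {
    NodeKind kind;
    bool feasible;                        /* leaves only: is this step open? */
    const struct AttackNode **children;   /* AND/OR nodes only */
    size_t n_children;
} AttackNode;

/* Recursively evaluate whether the attacker can reach this node's
   goal: an AND node needs every child path, an OR node needs any one. */
bool attack_reachable(const AttackNode *n) {
    if (n->kind == NODE_LEAF) return n->feasible;
    bool any = false, all = true;
    for (size_t i = 0; i < n->n_children; i++) {
        bool c = attack_reachable(n->children[i]);
        any = any || c;
        all = all && c;
    }
    return n->kind == NODE_AND ? all : any;
}
```

Flipping a single leaf to infeasible (for example, marking the JTAG-unlocked leaf false after enabling RDP Level 1) and re-evaluating shows exactly which goals a mitigation closes.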

Risk Prioritisation: Likelihood × Impact

Not all threats require equal mitigation effort. A risk prioritisation framework ensures that development effort is directed toward the threats that represent the highest actual risk to the product and its users.

The standard model scores each threat on two dimensions and multiplies them to produce a risk level:

  • Likelihood: How probable is it that an attacker will attempt and succeed with this attack? Factors: how many attackers have the required skill, whether attack tools are publicly available, how much physical access is required, whether the device is internet-facing.
  • Impact: What is the worst-case consequence if the attack succeeds? Factors: safety consequences, financial loss, data exposure scope, reputational damage, regulatory penalty, loss of device fleet control.

Score both on a 1-to-5 scale. Likelihood × Impact produces a risk score from 1 to 25. Anything scoring 15 or above requires a mitigation before release. Scores of 10 to 14 require a mitigation or a documented and approved acceptance decision. Scores below 10 may be accepted with monitoring.

For example: an unauthenticated buffer overflow in a UART command handler on a device that is never physically accessible to end users scores Likelihood 2 (requires physical access, low attacker pool) × Impact 5 (arbitrary code execution, full device compromise) = 10. The same vulnerability in an internet-accessible REST endpoint scores Likelihood 5 × Impact 5 = 25. Same vulnerability class, entirely different risk levels because of the attack vector difference.
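The scoring rule and its thresholds are simple enough to encode once and reuse across the threat register, which keeps release gates consistent between teams. A direct transcription of the scheme described above:

```c
#include <assert.h>

typedef enum {
    RISK_ACCEPT_MONITOR,      /* score below 10 */
    RISK_MITIGATE_OR_ACCEPT,  /* 10-14: mitigate, or documented acceptance */
    RISK_MUST_MITIGATE        /* 15+: mitigation required before release */
} RiskAction;

/* Likelihood and impact are each scored 1-5. */
int risk_score(int likelihood, int impact) {
    return likelihood * impact;
}

RiskAction risk_action(int score) {
    if (score >= 15) return RISK_MUST_MITIGATE;
    if (score >= 10) return RISK_MITIGATE_OR_ACCEPT;
    return RISK_ACCEPT_MONITOR;
}
```

Applied to the UART example: risk_score(2, 5) lands in the mitigate-or-accept band, while the internet-facing variant risk_score(5, 5) is an unconditional release blocker.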

Mitigation Strategies

For each threat that scores above the acceptance threshold, choose one of four responses:

  • Eliminate: Remove the vulnerable feature entirely. If the UART command handler is only used during development, disable it in production builds. No handler means no attack surface.
  • Mitigate: Add controls that reduce likelihood or impact to an acceptable level. Add input length validation and bounds checking to the handler. Add authentication to require a credential before any command is processed.
  • Transfer: Move the risk to a third party through insurance, a warranty clause or a contractual obligation on a supplier. Applicable for systemic risks (supply chain attacks on semiconductor vendors) that cannot be addressed through device design alone.
  • Accept: Document that the risk was identified, scored, and a conscious decision was made to accept it without further mitigation. Acceptable only for low-scoring residual risks after other mitigations are applied. Acceptance must be signed off by a product owner, not declared unilaterally by the development team.

The output of threat modeling is a threat register: a living document that records each identified threat, its STRIDE category, its risk score, the chosen response, the mitigation control implemented, and the test that verifies the mitigation is working. This document is reviewed at each major release and updated when new threats are identified through vulnerability disclosure or incident response.

Security-Focused Code Reviews

A code review process that catches security issues requires three things that ordinary functional code reviews do not always provide: reviewers with security-specific knowledge, a structured checklist that ensures consistent coverage across all security-relevant categories, and a policy that security findings block merge rather than being logged as future work.

Security Review Checklist for Embedded C

Apply this checklist to every pull request that touches security-relevant code paths: input handlers, authentication logic, firmware update processing, cryptographic operations, memory management and configuration parsing.

  • Input validation: all external inputs validated for type, range and format before use. Common finding: a length field from a network packet used as an array index without a bounds check.
  • Buffer handling: no banned functions (strcpy, sprintf, gets); all array writes bounded by the actual buffer size. Common finding: strncpy used correctly but the null terminator not explicitly added.
  • Return value checking: all security-relevant function return values checked; no silent failures. Common finding: the mbedtls_ssl_handshake() return value discarded; code proceeds regardless of outcome.
  • Authentication: authentication checks are fail-secure; any non-success result denies access. Common finding: an error return from the authentication function falls through to the access-granted path.
  • Cryptography: AEAD mode used; no ECB; IVs not reused; established library functions used. Common finding: AES-CBC used without a MAC; the IV is a constant byte sequence.
  • Secrets management: no hardcoded credentials; sensitive buffers cleared after use with a compiler-safe zero. Common finding: the WiFi password stored as a global char array initialised at compile time.
  • Error handling: error paths do not leave partial state; errors logged with sufficient context. Common finding: a failed malloc leaves a half-initialised struct in a global; subsequent code dereferences it.
  • Integer safety: arithmetic on user-controlled values pre-checked for overflow before the operation. Common finding: packet_length + HEADER_SIZE overflows uint16_t when packet_length is near 65535.
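The integer-safety item deserves a concrete illustration, since the fix is a rearrangement rather than a wider type. A sketch contrasting the wrapping check with an overflow-free one (the buffer and header sizes are illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HEADER_SIZE 8u
#define RX_BUF_SIZE 256u

/* BUGGY: the addition wraps when stored back into uint16_t, so a
   packet_length near 65535 produces a tiny total that passes the check. */
bool packet_fits_buggy(uint16_t packet_length) {
    uint16_t total = (uint16_t)(packet_length + HEADER_SIZE);  /* wraps at 65536 */
    return total <= RX_BUF_SIZE;
}

/* SAFE: rearranged so no arithmetic on the attacker-controlled value
   can overflow; the subtraction is between compile-time constants. */
bool packet_fits(uint16_t packet_length) {
    return packet_length <= RX_BUF_SIZE - HEADER_SIZE;
}
```

The review checklist question is exactly this pattern: does any comparison involving an attacker-controlled value perform arithmetic that can wrap before the comparison happens?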

Review Workflow

Security findings from code review should be tracked in the same issue tracker as functional bugs, with severity labels that map to your risk scoring framework. A finding equivalent to a risk score of 15 or above is a blocking issue: the pull request cannot be merged until it is resolved or until a documented risk acceptance decision is made by the appropriate product owner.

For teams without dedicated security engineers, the most effective approach is to designate a rotating “security reviewer” role for each sprint: one developer whose review responsibility in that sprint includes the security checklist in addition to normal functional review. This distributes security knowledge across the team more effectively than making it one person’s exclusive responsibility.

Static Analysis Tools for Embedded C

Static analysis examines source code or compiled binaries without executing them, applying pattern matching and dataflow analysis to find bug classes that are difficult to catch through manual review. For embedded C, the most productive categories of finding are: use of banned functions, unvalidated user input reaching memory operations, uninitialized variables, integer overflow in size calculations and discarded return values from security-relevant functions.

# Running Cppcheck on an embedded C codebase.
# Cppcheck is open source and widely supported in CI/CD pipelines.
# Configure it to treat certain findings as errors to block builds.

# Basic scan with all checks enabled, outputting to XML for CI integration
cppcheck \
  --enable=all \
  --std=c11 \
  --platform=arm32-wchar_t4 \
  --xml \
  --xml-version=2 \
  --suppress=missingIncludeSystem \
  --error-exitcode=1 \
  src/ \
  2> cppcheck_report.xml

# Specific checks most relevant to firmware security:
# --check-library: check proper use of library functions
# --enable=performance: find inefficiencies that often co-occur with bugs
# The following additional flags are important for embedded security work:

# Check for function pointers used without null check
cppcheck --enable=warning src/command_handler.c

# Check for integer overflow potential
cppcheck --enable=portability src/packet_parser.c

# For GCC-based toolchains, also add to the build:
# -fanalyzer: GCC's built-in static analyser (GCC 10+)
# Reports: null dereferences, use-after-free, buffer overflows
# Add to CFLAGS in your build system: -fanalyzer -Wanalyzer-too-complex

Beyond Cppcheck, the following tools are commonly used in professional embedded firmware security reviews:

  • Clang Static Analyzer: Part of the LLVM toolchain. Finds null dereferences, use-after-free, dead code and security-relevant patterns. Integrates with CMake via scan-build.
  • Coverity Static Analysis: Commercial tool with the most comprehensive C vulnerability database. Used by many industrial IoT vendors for pre-release certification. Expensive but produces very low false positive rates.
  • CodeChecker: Open source front-end that aggregates results from Clang Static Analyzer, Cppcheck and other tools into a unified dashboard with suppression management.
  • Semgrep: Rule-based pattern matching. Particularly effective for custom rules targeting your specific codebase patterns, such as detecting use of your internal APIs without the required authentication check.

Static analysis produces false positives. The goal is not zero findings but zero unreviewed findings above your severity threshold. Establish a suppression workflow where false positives are marked with a justification comment, so that re-running the analysis does not re-raise the same noise. True positives above the threshold block the build. True positives below the threshold are filed as technical debt issues with a scheduled remediation sprint.

Dynamic Testing and Fuzzing

Dynamic testing exercises the running firmware rather than examining source code. It finds bugs that static analysis cannot: race conditions, runtime memory corruption in execution paths that are not apparent from source code, and protocol state machine violations that only manifest under specific message sequences.

Fuzzing Embedded Firmware

Fuzzing feeds malformed, boundary-case or randomly mutated inputs to the firmware and monitors for crashes, hangs or unexpected behaviour. For embedded targets, there are two practical approaches:

Host-based fuzzing with a firmware simulation layer: Recompile the protocol parsing and application logic for Linux/x86, providing stub implementations of hardware-specific functions. Run AFL++ (the maintained successor to American Fuzzy Lop) or libFuzzer against the compiled binary. This approach achieves high throughput (millions of test cases per hour) and provides access to AddressSanitizer and UndefinedBehaviorSanitizer, which detect memory errors and integer overflows that would be silent on the actual hardware target.

/* Fuzzing harness for an MQTT packet parser.
   Compile for host with: clang -fsanitize=fuzzer,address,undefined -g \
                          -DFUZZ_BUILD mqtt_parser.c fuzz_mqtt.c -o fuzz_mqtt
   Then run with: ./fuzz_mqtt FUZZ_CORPUS/ -max_len=1024 -jobs=4

   The LLVMFuzzerTestOneInput function is called by libFuzzer with
   each generated test case. It returns 0 always; crashes are detected
   by the sanitizer, not by return values. */

#include <stdint.h>
#include <stddef.h>
#include "mqtt_parser.h"   /* Your actual parser header */

/* Fuzzing entry point called by libFuzzer */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    /* Guard: reject trivially small inputs that cannot form a valid packet */
    if (size < 2) return 0;

    /* Parse the fuzz input as though it arrived over the network.
       The parser should handle all inputs without crashing.
       AddressSanitizer will detect any out-of-bounds access. */
    MqttParseResult result = mqtt_parse_packet(data, size);

    /* Optionally exercise the dispatch path for parsed packets.
       Stub out any hardware calls that would be invoked. */
    if (result == MQTT_PARSE_OK) {
        mqtt_dispatch_stub();
    }

    /* Always return 0: non-zero return values are reserved by libFuzzer.
       Interesting inputs are kept via coverage feedback, not via the
       return value. */
    return 0;
}

On-device fuzzing over a test interface: For parsers that depend heavily on hardware state, use a UART or USB test interface to feed fuzz inputs directly to the running firmware on hardware. Boofuzz and Sulley are Python-based protocol fuzzing frameworks that can generate structured-field mutations for known protocols. This approach runs at lower throughput but tests the actual hardware target including timing-dependent behaviour.

AddressSanitizer for Embedded Linux

On embedded Linux targets (Raspberry Pi, i.MX6, AM335x), AddressSanitizer (ASan) can be built into the firmware application binary. ASan instruments every memory access and detects: heap buffer overflow, stack buffer overflow, use-after-free, use-after-return and memory leaks, all at runtime with the actual firmware logic running on real hardware. Compile with -fsanitize=address -g using a cross-compiler that supports ASan for your target architecture (GCC 4.8+ and Clang 3.1+ support ASan on ARM).

Regression Testing Security Controls

Security controls must be regression-tested alongside functional behaviour. A test suite that verifies functional correctness but never exercises security controls will not catch a code change that accidentally disables them. Write explicit test cases for each security requirement. Examples:

  • A test that delivers an unsigned firmware image to the OTA handler and asserts it is rejected.
  • A test that presents an expired TLS certificate and asserts the connection is refused.
  • A test that sends an oversized packet to the MQTT parser and asserts no crash and a correct error return.
  • A test that reads flash contents via the debug interface at RDP Level 1 and asserts no credentials are visible in plaintext.
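Such tests are ordinary unit tests; only the inputs are adversarial. A sketch of the oversized-packet case, using a stand-in parser (parse_stub, a hypothetical name) where a real suite would link the production parser entry point:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PACKET 1024

typedef enum { PARSE_OK, PARSE_ERR_TOO_LARGE, PARSE_ERR_SHORT } ParseResult;

/* Minimal stand-in for the real parser: it enforces the same size
   contract the security requirement specifies. */
ParseResult parse_stub(const unsigned char *buf, size_t len) {
    if (len < 2) return PARSE_ERR_SHORT;
    if (len > MAX_PACKET) return PARSE_ERR_TOO_LARGE;
    (void)buf;
    return PARSE_OK;
}

/* Regression test: an oversized packet must produce a clean error,
   never a crash or silent truncation. Returns 1 on pass. */
int test_oversized_packet_rejected(void) {
    unsigned char big[MAX_PACKET + 1] = {0};
    return parse_stub(big, sizeof big) == PARSE_ERR_TOO_LARGE;
}
```

Because the test asserts on the error code rather than on a crash, it keeps protecting the control even after the parser internals are rewritten.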

Penetration Testing Embedded Devices

Penetration testing combines the tools and techniques of an actual attacker with the goal of finding exploitable vulnerabilities before a real attacker does. For embedded devices, a penetration test covers three areas: network-facing attack surface, physical attack surface and firmware analysis.

A basic firmware penetration test workflow:

# Basic embedded firmware security assessment workflow.
# Run on a dedicated security assessment machine, not a production system.

# Step 1: Firmware extraction
# Attempt extraction via the update channel first (may be publicly distributed)
# If not available, try JTAG/SWD (should be locked in production)
# If still not available, physical flash chip reading via programmer

# Step 2: Analyse the firmware binary with binwalk
pip install binwalk --break-system-packages
binwalk -e firmware.bin

# Step 3: Check for embedded credentials and interesting strings
strings firmware.bin | grep -Ei "password|passwd|secret|key|token|api|admin"
strings firmware.bin | grep -E "([0-9]{1,3}\.){3}[0-9]{1,3}"  # IP addresses

# Step 4: Identify open source components for known CVEs
# Extract version strings
strings firmware.bin | grep -Ei "version|v[0-9]+\.[0-9]+"

# Step 5: Network attack surface enumeration on the running device
nmap -sV -p- --open --script vuln TARGET_IP

# Step 6: MQTT assessment
# Test for anonymous connection (should fail): omitting -u/-P attempts
# an unauthenticated connection
mosquitto_sub -h TARGET_IP -p 1883 -t "#"

# Test default credentials
mosquitto_sub -h TARGET_IP -p 1883 -t "#" -u admin -P admin

# Step 7: TLS configuration assessment
# Check supported cipher suites and TLS version
nmap --script ssl-enum-ciphers -p 8883 TARGET_IP

# Check certificate validity and pinning
openssl s_client -connect TARGET_IP:8883 -showcerts

# Step 8: Identify if JTAG is accessible on the PCB
# Check for 4-6 pin headers with 100 mil (2.54 mm) spacing
# Use a JTAGulator or similar to identify pinout if unlabeled

The penetration test report documents each finding with: a description of the vulnerability, the exploitation scenario, the risk score using your likelihood-impact framework, the recommended mitigation, and whether a proof-of-concept exploit was demonstrated. The report becomes input to the threat register and the next sprint's security work backlog.
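
As a sketch of how a finding's risk score might be computed and recorded (assuming a simple 1-5 likelihood and impact scale; the scoring bands and output format here are illustrative, not a standard):

```shell
#!/bin/bash
# Hypothetical helper for recording a pen-test finding with a
# likelihood x impact risk score. Scales and bands are assumptions.

score_finding() {
    local title="$1" likelihood="$2" impact="$3"   # both on a 1-5 scale
    local risk=$((likelihood * impact))
    local band="Low"
    if [ "$risk" -ge 15 ]; then band="High"
    elif [ "$risk" -ge 8 ]; then band="Medium"
    fi
    printf '%s | L=%d I=%d | risk=%d (%s)\n' "$title" "$likelihood" "$impact" "$risk" "$band"
}

score_finding "Anonymous MQTT subscribe allowed" 4 4
# prints: Anonymous MQTT subscribe allowed | L=4 I=4 | risk=16 (High)
```

Each scored line feeds directly into the threat register, so the pen-test findings use the same risk vocabulary as the design-phase threat model.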

Secure Deployment Preparation

A device that is secure in development can become vulnerable through an improper deployment process. The deployment phase includes: the factory provisioning step that configures device-unique credentials and hardens the hardware, the production firmware build and signing process, the configuration in which the device ships to the customer, and the end-of-life process that ensures devices leaving service do not remain operational on the network.

Pre-Deployment Security Checklist

Verify every item on this checklist for each production hardware revision before authorising shipment:

  • All debug interfaces locked: JTAG/SWD at RDP Level 1 minimum; download mode disabled.
  • No test accounts, backdoor accounts or development credentials present in the production firmware build.
  • Secure boot enabled and verified: confirmed that a modified firmware binary is rejected at boot.
  • Flash encryption enabled where supported (ESP32, STM32L5).
  • Debug UART output disabled: no serial console output at production baud rate.
  • Unique per-device credentials provisioned: two random production units have different passwords.
  • Production firmware binary signed with the HSM-protected production signing key.
  • Anti-rollback counter set to the current firmware version.
  • MPU configured: bootloader region write-protected, stack regions enforced.
  • All unnecessary network services disabled: no open ports beyond the minimum required for device function.
  • OTA update endpoint requires mutual TLS and firmware signature verification.
  • Tamper detection circuit tested: verified that case opening triggers the expected response.
  • Device certificate provisioned with correct validity period and subject fields.
  • Watchdog timer enabled and verified: confirmed that halting the main loop triggers a reset.
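
Several checklist items lend themselves to automated spot-checks at the QC station. The following is a hypothetical sketch (function names, output formats and the nmap-style input are assumptions, not a finished QC system):

```shell
#!/bin/bash
# Hypothetical factory QC helpers for two checklist items.

# "Unique per-device credentials provisioned": passwords sampled from two
# production units must differ.
check_unique_credentials() {
    if [ "$1" = "$2" ]; then
        echo "FAIL: sampled units share a password"
        return 1
    fi
    echo "PASS: sampled credentials are unique"
}

# "All unnecessary network services disabled": given scanner output tokens
# such as "8883/open", fail if any open port is outside the allowlist.
check_open_ports() {
    local scan_output="$1" allowlist="$2" port
    for port in $(echo "$scan_output" | grep -oE '[0-9]+/open' | cut -d/ -f1); do
        if ! echo "$allowlist" | grep -qw "$port"; then
            echo "FAIL: unexpected open port $port"
            return 1
        fi
    done
    echo "PASS: only allowlisted ports open"
}
```

Checks like these run as a gate in the QC step: a single FAIL blocks the shipment authorisation for that hardware revision until the finding is dispositioned.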

The Device Provisioning Process

Device provisioning is the factory-floor process that transforms a generic hardware unit into a uniquely identified, securely configured production device. It is one of the most security-critical operations in the product lifecycle because the credentials and configuration loaded at this stage define the device's security properties for its entire operational life.

#!/bin/bash
# Automated device provisioning script.
# Run on a dedicated provisioning station connected to the device via USB/JTAG.
# The provisioning station has access to the manufacturer CA's signing infrastructure
# (either an HSM directly connected or via an authenticated API to a cloud HSM).
# This script is illustrative; production provisioning uses a dedicated
# provisioning server with audit logging, access control and error recovery.

set -euo pipefail

DEVICE_PORT="/dev/ttyUSB0"
PROVISIONING_LOG="provision_$(date +%Y%m%d_%H%M%S).log"

echo "=== Device Provisioning Script ===" | tee -a "$PROVISIONING_LOG"

# Step 1: Flash the production firmware binary
echo "[1/8] Flashing production firmware..." | tee -a "$PROVISIONING_LOG"
esptool.py --port "$DEVICE_PORT" write_flash 0x0 firmware_production_signed.bin
echo "  OK" | tee -a "$PROVISIONING_LOG"

# Step 2: Read the device's unique identifier (eFuse MAC address)
DEVICE_ID=$(esptool.py --port "$DEVICE_PORT" read_mac | grep "MAC:" | awk '{print $2}')
echo "[2/8] Device ID: $DEVICE_ID" | tee -a "$PROVISIONING_LOG"

# Step 3: Generate a device-unique strong password via the provisioning server
# The provisioning server uses a CSPRNG seeded from its HSM's entropy source
DEVICE_PASSWORD=$(curl -s --cert prov_client.crt --key prov_client.key \
  "https://provisioning.internal/api/v1/credentials/$DEVICE_ID/password")
echo "[3/8] Device password generated (not logged)" | tee -a "$PROVISIONING_LOG"

# Step 4: Generate the device keypair inside the secure element
# The secure element generates the keypair internally; the private key never leaves it
# Export only the public key for CSR generation
echo "[4/8] Generating keypair in secure element..." | tee -a "$PROVISIONING_LOG"
DEVICE_PUBKEY=$(python3 atca_provision.py --device "$DEVICE_PORT" generate_key --slot 0)
echo "  OK - Public key: ${DEVICE_PUBKEY:0:20}..." | tee -a "$PROVISIONING_LOG"

# Step 5: Submit CSR to manufacturer CA and receive signed certificate
echo "[5/8] Requesting device certificate..." | tee -a "$PROVISIONING_LOG"
DEVICE_CERT=$(curl -s -X POST \
  --cert prov_client.crt --key prov_client.key \
  -d "{\"device_id\": \"$DEVICE_ID\", \"public_key\": \"$DEVICE_PUBKEY\"}" \
  "https://ca.internal/api/v1/sign")
echo "  OK - Certificate issued" | tee -a "$PROVISIONING_LOG"

# Step 6: Load the certificate and password into device NVS
echo "[6/8] Loading credentials to device NVS..." | tee -a "$PROVISIONING_LOG"
python3 nvs_provision.py --port "$DEVICE_PORT" \
  --set device_cert "$DEVICE_CERT" \
  --set device_password "$DEVICE_PASSWORD"
echo "  OK" | tee -a "$PROVISIONING_LOG"

# Step 7: Burn security eFuses (irreversible - last step before final verification)
echo "[7/8] Burning security eFuses..." | tee -a "$PROVISIONING_LOG"
espefuse.py --port "$DEVICE_PORT" --do-not-confirm burn_efuse JTAG_DISABLE 1
espefuse.py --port "$DEVICE_PORT" --do-not-confirm burn_efuse DOWNLOAD_DIS 1
echo "  OK - JTAG and download mode permanently disabled" | tee -a "$PROVISIONING_LOG"

# Step 8: Register device in the cloud backend device registry
echo "[8/8] Registering device in cloud backend..." | tee -a "$PROVISIONING_LOG"
curl -s -X POST \
  --cert prov_client.crt --key prov_client.key \
  -d "{\"device_id\": \"$DEVICE_ID\", \"certificate\": \"$DEVICE_CERT\"}" \
  "https://api.internal/v1/devices/register"
echo "  OK - Device registered" | tee -a "$PROVISIONING_LOG"

echo "=== Provisioning Complete: $DEVICE_ID ===" | tee -a "$PROVISIONING_LOG"

Every provisioning run generates a signed audit log entry. The audit log records: the device ID, the firmware version installed, the certificate serial number issued, the timestamp, the provisioning station ID, and the provisioning operator's authenticated identity. This log is the evidence that a specific device was correctly provisioned; it is required during security incident investigations and compliance audits.

Security Documentation That Actually Gets Used

Security documentation serves three audiences with different needs: the development team (who need it to make correct implementation decisions), the operations team (who need it to deploy and maintain the device correctly) and auditors (who need it to verify that a claimed security property is implemented and tested). Documents that address only one audience are incomplete.

The Six Essential Documents

Security Architecture Document: Describes the system components, trust boundaries, authentication flows and encryption mechanisms in enough detail to allow a developer new to the project to understand why each security decision was made. Includes the DFD used during threat modeling, the STRIDE analysis table, and the mapping from security requirements to implementation mechanisms.

Threat Model: The living threat register with each identified threat, its STRIDE category, risk score, chosen response, implemented control and test reference. Updated at each major release and whenever a new vulnerability class is disclosed that applies to the system.

Security Requirements Specification: The verifiable security requirements from the requirements phase, each with its acceptance criterion and a reference to the test that verifies it. Includes the regulatory compliance matrix mapping each requirement to the specific regulatory clause it satisfies.

Test Results: The outputs of static analysis runs (with suppression justifications), fuzzing campaigns (seed corpus size, total executions, findings), penetration test reports and regression test results for each release. Provides evidence that security controls were tested and were passing at release time.

Secure Configuration Guide: The specific settings, option byte values, eFuse configuration and network firewall rules that constitute a correctly secured device. Should be specific enough that a factory technician can configure a device from it without interpretation. Includes the pre-deployment checklist in a form suitable for use as a quality control step at the factory.

Incident Response Plan: The documented procedure for responding to a security incident involving the device: who to notify, how to quarantine affected devices (credential revocation, network block), how to investigate the root cause, how to deploy an emergency patch, and the vulnerability disclosure process for publicly reporting discovered vulnerabilities and their mitigations. Having this plan written before an incident is the difference between a managed response and a reactive crisis.

Security in Code Comments

In addition to the formal documents, security-relevant decisions in code must be explained in comments at the point of implementation. Future developers who do not have the context of the original security review will make changes that seem logically correct but violate a security assumption. A comment that explains the assumption prevents this:

/* SECURITY: Constant-time comparison required here.
   Standard memcmp() short-circuits on the first mismatched byte,
   which leaks information about how many leading bytes of the
   correct hash the caller's input matched. This is exploitable
   as a timing oracle to guess the correct hash one byte at a time.
   Do NOT replace this with memcmp() even if it appears equivalent. */
if (constant_time_memcmp(computed_hash, expected_hash, SHA256_DIGEST_LENGTH) != 0) {
    reject_firmware_image();
}

/* SECURITY: This buffer must be zeroed before freeing.
   It contains the session encryption key. If not explicitly zeroed,
   the compiler may eliminate the memset as a dead store.
   Use secure_zero(), not memset(), to prevent this optimisation. */
secure_zero(session_key_buffer, sizeof(session_key_buffer));
free(session_key_buffer);
session_key_buffer = NULL;
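
For reference, minimal sketches of the two helpers these comments assume, following memcmp-style semantics (returns 0 on match). Many embedded crypto libraries ship hardened equivalents (for example libsodium's sodium_memcmp and sodium_memzero), and some toolchains additionally need a memory barrier to guarantee the zeroing survives link-time optimisation:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 0 iff the buffers match. Accumulates the XOR of every byte pair
   so the running time does not depend on where the first mismatch occurs. */
int constant_time_memcmp(const void *a, const void *b, size_t len)
{
    const volatile uint8_t *pa = a;
    const volatile uint8_t *pb = b;
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++) {
        diff |= pa[i] ^ pb[i];
    }
    return diff; /* 0 on match, non-zero on any difference */
}

/* Zeroes a buffer through a volatile pointer so the compiler cannot
   eliminate the stores as dead code. */
void secure_zero(void *buf, size_t len)
{
    volatile uint8_t *p = buf;
    while (len--) {
        *p++ = 0;
    }
}
```
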

Conclusion

A secure SDLC for embedded systems is what separates a device that ships with known, documented and tested security properties from one that ships with unknown vulnerabilities waiting to be found by whoever gets to them first. The work in this article (writing verifiable security requirements rather than aspirational ones; applying STRIDE to every trust boundary in the system architecture; running structured code reviews with a checklist that covers the categories where embedded vulnerabilities cluster; fuzzing protocol parsers before shipping them; hardening deployment through an automated provisioning process; and maintaining the documentation that makes all of these controls auditable) does not add uncontrolled scope to a project. It replaces the uncontrolled scope of responding to vulnerabilities after devices are deployed. The choice is between spending the time during development, when the cost of change is manageable, and spending it after deployment, when the cost can be catastrophic.
