Advanced Security Testing

Deep security assurance suite for wallet internals. This page documents how the advanced tests work, what classes of vulnerabilities they target, and how to interpret their findings.

Overview

The advanced security testing suite builds on the core unit, integration, cryptographic, and fuzz tests. It focuses on protocol correctness under adversarial conditions and implementation robustness, using several complementary techniques:

  • Differential fuzzing for mnemonic and amount parsing consistency
  • Mutation testing for address and transaction input validation
  • Chaos engineering for rate limiting and memory pressure
  • Adversarial testing for injection, timing, and buffer abuse
  • Timing attack detection for sensitive parsing operations
  • Memory safety testing for leak and corruption detection
  • Side-channel analysis for error message and timing leakage
  • Cryptographic differential testing for BIP39 and secret parsing

Running the Advanced Suite

# From keythings-monorepo root
bun run security:advanced

# Or run the suite directly:
bun tests/security/suites/advanced-security-testing-suite.ts

# Example output:
[SECURITY] Starting Advanced Security Testing Suite...
[SECURITY] Running Differential Fuzzing...
[SECURITY] Running Mutation Testing...
[SECURITY] Running Chaos Engineering Tests...
[SECURITY] Running Adversarial Testing...
[SECURITY] Running Timing Attack Detection...
[SECURITY] Running Memory Safety Testing...
[SECURITY] Running Side-Channel Analysis...
[SECURITY] Running Cryptographic Differential Testing...
[SECURITY] Security Testing Results:
Total Findings: 0
Critical: 0
High: 0
Medium: 0
Low: 0
Info: 0

The suite prints a structured JSON report at the end with per-test-type counts and detailed findings. CI and auditors treat critical and high findings as blockers; medium findings are used to drive performance and resilience improvements.
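The report shape can be sketched roughly as follows. The field names here are illustrative, inferred from the console output above and the finding objects shown later on this page; they are not the suite's exact schema:

```typescript
// Illustrative finding and report shapes; field names are assumptions,
// not the suite's exact schema.
type Severity = 'critical' | 'high' | 'medium' | 'low' | 'info'

interface SecurityFinding {
  testType: string      // e.g. 'differential', 'mutation', 'chaos'
  target: string        // function or component under test
  observation: string   // human-readable description of what was seen
  severity: Severity
}

interface SecurityReport {
  totalFindings: number
  bySeverity: Record<Severity, number>
  findings: SecurityFinding[]
}

// Aggregate findings into the summary counts printed at the end of a run.
function summarize(findings: SecurityFinding[]): SecurityReport {
  const bySeverity: Record<Severity, number> = {
    critical: 0, high: 0, medium: 0, low: 0, info: 0,
  }
  for (const finding of findings) bySeverity[finding.severity]++
  return { totalFindings: findings.length, bySeverity, findings }
}
```

A CI gate then only needs to check `bySeverity.critical` and `bySeverity.high` against zero to decide whether to block.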

1. Differential Fuzzing

Differential fuzzing compares multiple implementations or representations of the same logical value to catch subtle inconsistencies that could lead to silent fund mis-accounting or seed corruption.

Mnemonic Parsing Consistency

The suite generates random BIP39 entropies, derives mnemonics, and then feeds them into parseSecretInput in different normalised forms (original, lowercased, uppercased):

// fc is fast-check; entropyToMnemonic/mnemonicToEntropy come from the wallet's
// BIP39 library; parseSecretInput is a wallet-internal helper.
await fc.assert(fc.asyncProperty(entropyArb, async (entropy) => {
  const mnemonic = entropyToMnemonic(entropy, english)

  try {
    const parsed1 = await parseSecretInput(mnemonic)
    const parsed2 = await parseSecretInput(mnemonic.toLowerCase())
    const parsed3 = await parseSecretInput(mnemonic.toUpperCase())

    assert.deepEqual(parsed1.bytes, parsed2.bytes)
    assert.deepEqual(parsed1.bytes, parsed3.bytes)

    if (parsed1.mnemonic) {
      const roundTrip = mnemonicToEntropy(parsed1.mnemonic, english)
      assert.deepEqual(Array.from(entropy), Array.from(roundTrip))
    }
  } catch (error) {
    if (error instanceof Error && error.message === 'INVALID_MNEMONIC') {
      // Expected: Keeta SDK rejects weak or non-standard mnemonics
      return true
    }
    throw error
  }
}), { numRuns: 100 })

Issues this test is designed to detect:

  • Inconsistent parsing of mnemonics (e.g. case-sensitive behaviour) that could derive different keys for the same phrase.
  • Lossy round-trips where mnemonicToEntropy does not recover the original entropy.

Severity: Any divergence in derived keys is treated as critical, since it can make restores non-deterministic or send funds to the wrong key. Unexpected INVALID_MNEMONIC results on valid inputs are high severity.

Amount Parsing Consistency

For token amounts, the suite compares human-readable decimal strings to the internal integer representation produced by parseAndValidateAmount. It ensures that padded and unpadded fractional strings either both fail validation or both produce the same integer value:

fc.assert(fc.property(amountArb, ({ whole, decimals, fractional }) => {
  const wholeString = whole.toString()
  const amountString = fractional.length > 0
    ? `${wholeString}.${fractional}`
    : wholeString
  const paddedFractional = decimals === 0 ? '' : fractional.padEnd(decimals, '0')
  const paddedAmountString = decimals === 0
    ? wholeString
    : `${wholeString}.${paddedFractional}`

  if (fractional.length > decimals) {
    assert.throws(() => parseAndValidateAmount(amountString, decimals))
    if (decimals > 0) {
      assert.throws(() => parseAndValidateAmount(paddedAmountString, decimals))
    }
    return
  }

  // Values that overflow MAX_SAFE_AMOUNT are expected to throw
  // otherwise padded/unpadded forms must match exactly.
  const result1 = parseAndValidateAmount(amountString, decimals)
  const result2 = parseAndValidateAmount(paddedAmountString, decimals)
  assert.equal(result1, result2)
}), { numRuns: 200 })

Issues this test is designed to detect:

  • Rounding or truncation bugs that mis-price orders or display balances inconsistent with on-chain amounts.
  • Overflow / underflow when amounts exceed the configured MAX_SAFE_AMOUNT.

Severity: Incorrect amount conversions are treated as high to critical, depending on whether they could move value incorrectly or only affect display.

2. Mutation Testing

Mutation testing systematically corrupts otherwise valid inputs to ensure that validation logic rejects them. The suite focuses on validateKeetaAddress and validateTransactionData.

const chars = '0123456789abcdefABCDEF!@#$%^&*()' // hex digits plus common metacharacters
for (const address of validAddresses) {
  for (let i = 0; i < address.length; i++) {
    for (const char of chars) {
      if (char === address[i]) continue
      const mutated = address.slice(0, i) + char + address.slice(i + 1)
      try {
        validateKeetaAddress(mutated)
        findings.push({
          testType: 'mutation',
          target: 'validateKeetaAddress',
          observation: 'Mutated address passed validation',
          severity: 'high',
        })
      } catch {
        // Expected to fail
      }
    }
  }
}

Issues this test is designed to detect:

  • Lenient address validation that accepts one-character mutations or malformed hex strings.
  • Transaction data that accepts obviously invalid patterns (all zeros, all ones) without additional business rules.

Severity: Address validation weaknesses are high severity, as they can be used in phishing or misdirection attacks where a visually similar but incorrect address is accepted.
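The same single-character sweep generalises to transaction fields. The sketch below is self-contained: the validator is a stand-in that only enforces the trivially invalid patterns mentioned above (the real validateTransactionData carries additional business rules), and the mutation helper mirrors the address loop:

```typescript
// Stand-in validator, NOT the wallet's real validateTransactionData: it only
// rejects non-numeric, all-zero, and all-one amount strings.
function validateTransactionData(data: { amount: string }): void {
  if (!/^[0-9]+$/.test(data.amount)) throw new Error('INVALID_AMOUNT')
  if (/^0+$/.test(data.amount)) throw new Error('INVALID_AMOUNT') // all zeros
  if (/^1+$/.test(data.amount)) throw new Error('INVALID_AMOUNT') // all ones
}

// Single-character mutation sweep: replace each position with every character
// from the alphabet and record any mutant the validator accepts.
function mutationSurvivors(valid: string, alphabet: string): string[] {
  const survivors: string[] = []
  for (let i = 0; i < valid.length; i++) {
    for (const ch of alphabet) {
      if (ch === valid[i]) continue
      const mutated = valid.slice(0, i) + ch + valid.slice(i + 1)
      try {
        validateTransactionData({ amount: mutated })
        survivors.push(mutated) // a surviving mutant becomes a finding
      } catch {
        // Expected: mutant rejected
      }
    }
  }
  return survivors
}
```

With an alphabet of metacharacters, every mutant of a valid numeric amount should be rejected, so a non-empty survivor list maps directly to a high-severity finding.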

3. Chaos Engineering & Rate Limiting

Chaos tests push the RateLimiter and the runtime memory allocator under stress to ensure that abuse does not degrade performance or bypass protections.

// Exhaust the 5-request window, then hammer it with 100 more requests.
const limiter = new RateLimiter({ windowMs: 1000, maxRequests: 5 })
for (let i = 0; i < 5; i++) limiter.checkLimit(origin)

let exceptions = 0
for (let i = 0; i < 100; i++) {
  try { limiter.checkLimit(origin) } catch { exceptions++ }
}

// Allow a small tolerance for window rollover; anything more is a bypass.
if (exceptions < 95) {
  findings.push({
    testType: 'chaos',
    target: 'RateLimiter',
    observation: 'Too many requests slipped through',
    severity: 'high',
  })
}

Issues this test is designed to detect: Session or origin rate limits that can be bypassed by burst traffic, enabling brute-force or credential stuffing attacks.

Severity: Rate-limit bypass is treated as high severity. Excessive CPU time under load is recorded as medium for performance hardening.
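The memory-pressure half of the chaos tests can be sketched as follows. The allocation sizes and the idea of timing a cheap operation under pressure are illustrative assumptions, not the suite's exact mechanics:

```typescript
// Allocate a burst of large buffers, then verify a cheap operation still
// completes promptly. The ~50 MB total is an illustrative figure.
function memoryPressureProbe(): number {
  const buffers: Uint8Array[] = []
  for (let i = 0; i < 50; i++) {
    buffers.push(new Uint8Array(1024 * 1024)) // ~1 MB each
  }

  const start = Date.now()
  let sum = 0
  for (let i = 0; i < 100_000; i++) sum += i // cheap CPU-bound work

  buffers.length = 0 // release references so the allocator can reclaim them
  return Date.now() - start
}
```

A probe that takes dramatically longer under allocation pressure than at rest would be recorded as a medium-severity performance finding, in line with the severity mapping above.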

4. Timing Attack Detection & Side-Channel Analysis

The suite measures execution time for parseAndValidateAmount, parseSecretInput, and address validation across valid and invalid inputs. It flags large timing differences or divergent error messages that could act as side channels.

// measureOperation returns the elapsed time (in ms) for the given closure
const timings = testAmounts.map(amount => measureOperation(() => {
  try { parseAndValidateAmount(amount, 18) } catch {}
}, amount))

// Flag a spread of more than 0.5 ms between the fastest and slowest input
const variation = Math.max(...timings) - Math.min(...timings)
if (variation > 0.5) {
  findings.push({
    testType: 'timing',
    target: 'parseAndValidateAmount',
    observation: `Timing variation: ${variation}ms`,
    severity: 'low',
  })
}

Vulnerabilities detected:

  • Timing side channels that distinguish valid vs invalid secrets or amounts.
  • Error messages that leak too much information about why validation failed.

Severity: Timing differences are generally low to medium severity unless they can be shown to be exploitable at scale. Divergent error messages that reveal secret structure are at least medium.
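The error-message half of the analysis can be sketched as follows. The parser here is a deliberately simplified stand-in that throws one generic message, which is the behaviour the suite expects; the real parseAndValidateAmount lives in the wallet:

```typescript
// Stand-in parser that fails with a single generic message regardless of
// WHY the input is invalid, so the message leaks nothing about structure.
function parseAmountStrict(input: string): bigint {
  if (!/^[0-9]+(\.[0-9]+)?$/.test(input)) throw new Error('INVALID_AMOUNT')
  return BigInt(input.replace('.', ''))
}

// Collect the distinct error messages produced by a set of invalid inputs.
// More than one distinct message would indicate a potential side channel.
function distinctErrorMessages(inputs: string[]): Set<string> {
  const messages = new Set<string>()
  for (const input of inputs) {
    try {
      parseAmountStrict(input)
    } catch (error) {
      messages.add(error instanceof Error ? error.message : String(error))
    }
  }
  return messages
}
```

Running structurally different invalid inputs (non-numeric, double dots, empty, negative) through the collector should yield exactly one distinct message.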

5. Memory Safety Testing

Memory safety tests look for unbounded heap growth and large allocations in response to untrusted input. The suite exercises parseAndValidateAmount thousands of times and records heap snapshots before and after.

const initial = process.memoryUsage()
for (let i = 0; i < 1000; i++) {
  try { parseAndValidateAmount(`${i}.${i % 100}`, 2) } catch {}
}
if (global.gc) global.gc() // only available when run with --expose-gc
const final = process.memoryUsage()
const growth = ((final.heapUsed - initial.heapUsed) / initial.heapUsed) * 100
if (growth > 20) {
  findings.push({
    testType: 'memory',
    target: 'parseAndValidateAmount',
    observation: `Memory growth: ${growth.toFixed(2)}%`,
    severity: 'medium',
  })
}

Issues this test is designed to detect: Potential memory leaks or pathological allocations driven by malformed numeric input.

Severity: Marked as medium, since leaks can lead to denial-of-service or degraded user experience.

6. Cryptographic Differential Testing

In addition to the dedicated Cryptographic Security Testing page, the advanced suite performs differential checks against BIP39 operations using the same primitives the wallet relies on in production.

Vulnerabilities detected:

  • Non-deterministic mnemonic generation for a fixed entropy buffer.
  • Lossy round-trips between entropy and mnemonic representations.
  • Inconsistencies between parseSecretInput and the underlying BIP39 implementation.

Severity: Any discrepancy in key material is critical, since it can result in irretrievable funds.
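Conceptually, the determinism and round-trip checks reduce to the sketch below. The word mapping is a toy stand-in (one word per byte) so the example is self-contained; the real suite uses the wallet's production BIP39 primitives with the English wordlist and checksummed entropy groups:

```typescript
// Toy stand-ins for entropyToMnemonic / mnemonicToEntropy: each byte maps to
// one synthetic word. The real suite uses the wallet's BIP39 primitives.
const WORDS = Array.from({ length: 256 }, (_, i) => `word${i}`)

function toyEntropyToMnemonic(entropy: Uint8Array): string {
  return Array.from(entropy, b => WORDS[b]).join(' ')
}

function toyMnemonicToEntropy(mnemonic: string): Uint8Array {
  return Uint8Array.from(mnemonic.split(' ').map(w => Number(w.slice(4))))
}

// Differential check: a fixed entropy buffer must produce the same mnemonic
// every time, and the mnemonic must round-trip to the original entropy.
function checkDeterministicRoundTrip(entropy: Uint8Array): boolean {
  const m1 = toyEntropyToMnemonic(entropy)
  const m2 = toyEntropyToMnemonic(entropy)
  if (m1 !== m2) return false // non-deterministic generation

  const recovered = toyMnemonicToEntropy(m1)
  return recovered.length === entropy.length &&
    recovered.every((byte, i) => byte === entropy[i])
}
```

Any `false` result from either check, when run against the production primitives, would be reported as a critical finding under the severity mapping below.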

Vulnerability Classes & Severity Mapping

Technique                         | Primary Vulnerabilities                               | Typical Severity
Differential fuzzing (mnemonics)  | Inconsistent key derivation, lossy entropy round-trip | High / Critical
Differential fuzzing (amounts)    | Rounding errors, overflow, mis-accounting             | High / Critical
Mutation testing                  | Weak address/transaction validation                   | High
Chaos engineering                 | Rate-limit bypass, performance collapse under load    | Medium / High
Timing & side-channel tests       | Timing channels, verbose error messages               | Low / Medium
Memory safety tests               | Long-lived heap growth, large allocations             | Medium
Crypto differential tests         | Non-deterministic or lossy crypto flows               | Critical

Research & References

The design of the advanced suite is informed by industry best practices and real-world incidents:

  • Input Validation & Numeric Parsing: The OWASP Input Validation Cheat Sheet recommends strict allowlisting and defensive parsing for financial data to avoid rounding and overflow issues.
  • Timing & Side-Channel Attacks: OWASP highlights timing-based attacks in its Cryptographic Failures category, where small timing differences can leak key material or validity of secrets.
  • Wallet & Extension Incidents: Public reports on malicious browser extensions stealing seed phrases (for example, The Hacker News coverage of 49 malicious Chrome crypto extensions) underscore the need for strict internal validation, non-custodial design, and hardening against side-channel leakage.

Together with the cryptographic, fuzz, and UI/UX suites, the advanced tests provide a defense-in-depth view of wallet safety: they do not just test that features work, but that they fail safely and predictably under hostile conditions.