Bonus 8 — Database Security & Compliance

“Performance bug = customer chậm. Security bug = customer leak ra dark web. Tuần này dạy bạn defense layers: encryption, access control, audit, compliance.”

Tags: database security compliance encryption gdpr postgres Thời lượng: 4-5 ngày Prerequisites: Tuan-07-Backup-PITR-DR Liên quan: Tuan-14-AuthN-AuthZ-Security (SD course) · Tuan-15-Data-Security-Encryption (SD course)


1. Threat Model

graph TB
    subgraph "Threats"
        T1[SQL injection]
        T2[Privilege escalation]
        T3[Data exfiltration]
        T4[Backup theft]
        T5[Disk loss / theft]
        T6[Insider threat]
        T7[Ransomware]
        T8[Cross-tenant leak]
    end

    subgraph "Defenses"
        D1[Parameterized queries]
        D2[Least privilege role]
        D3[Network isolation, VPC]
        D4[Encrypted backups + KMS]
        D5[Disk encryption]
        D6[Audit logging]
        D7[Immutable backups]
        D8[RLS + tested isolation]
    end

    T1 -.fix.-> D1
    T2 -.fix.-> D2
    T3 -.fix.-> D3
    T4 -.fix.-> D4
    T5 -.fix.-> D5
    T6 -.fix.-> D6
    T7 -.fix.-> D7
    T8 -.fix.-> D8

2. Encryption Layers

2.1 Encryption in Transit

-- postgresql.conf
ssl = on
ssl_cert_file = '/path/server.crt'
ssl_key_file = '/path/server.key'
 
-- pg_hba.conf - require SSL
hostssl all all 0.0.0.0/0 scram-sha-256

Client:

postgresql://user:pass@host/db?sslmode=verify-full&sslrootcert=/path/ca.crt

sslmode:

  • disable — no encryption
  • prefer — try SSL, fallback (vulnerable)
  • require — require SSL, no cert verify
  • verify-ca — verify CA
  • verify-full — verify CA + hostname (default for production)

2.2 Encryption at Rest

Disk-level (Linux):

  • LUKS / dm-crypt — Linux native
  • AWS EBS encryption
  • GCP persistent disk encryption

Filesystem-level:

  • ZFS encryption
  • ext4 encryption

DB-level (TDE):

  • Oracle, SQL Server: built-in TDE
  • Postgres core: không có TDE (đến PG17/18). Có trong fork: EDB Advanced Server, Cybertec patches, Percona Postgres.
  • Real-world Postgres: kết hợp disk-level encryption (LUKS/EBS) + column-level cho PII via pgcrypto

2.3 Column-level encryption

For PII columns specifically:

CREATE EXTENSION pgcrypto;
 
-- Encrypt
INSERT INTO users (email, ssn_encrypted)
VALUES (
    '[email protected]',
    pgp_sym_encrypt('123-45-6789', current_setting('app.encryption_key'))
);
 
-- Decrypt
SELECT email, pgp_sym_decrypt(ssn_encrypted::bytea, current_setting('app.encryption_key'))
FROM users WHERE id = 1;

⚠️ Issue: indexes on encrypted columns don’t work for range queries.

2.4 Queryable encryption (MongoDB 6.0+, partial)

Some operations on encrypted data without decryption. Niche, expensive.

2.5 Key management

graph LR
    DB --> KMS[Key Management System<br/>AWS KMS, Vault, GCP KMS]
    KMS --> HSM[HSM hardware]
    DB -.encryption keys.-> Backup[Backup]

Never store key in same place as encrypted data. Use KMS service.


3. Access Control

3.1 Postgres role hierarchy

-- Don't use postgres superuser for app!
CREATE ROLE app_user LOGIN PASSWORD '...';
CREATE ROLE app_readonly;
CREATE ROLE app_admin;
 
-- Grant
GRANT CONNECT ON DATABASE appdb TO app_user;
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO app_user;
 
-- Future tables (PG10+)
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_user;

3.2 Least privilege per service

-- Service A only reads
CREATE ROLE service_a_user IN GROUP app_readonly LOGIN;
 
-- Service B writes to orders only
CREATE ROLE service_b_user LOGIN;
GRANT INSERT, UPDATE ON orders TO service_b_user;
GRANT SELECT ON orders, users TO service_b_user;

3.3 No DROP / TRUNCATE for app

-- App user CANNOT drop tables
-- Even if compromised, attacker can't wipe data
REVOKE TRUNCATE ON ALL TABLES FROM app_user;
REVOKE DROP ON ALL TABLES FROM app_user;

3.4 ALTER DEFAULT PRIVILEGES

ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO app_readonly;
-- Future tables auto-grant to app_readonly

3.5 Row Level Security (RLS) — recap

ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON orders
    USING (tenant_id = current_setting('app.current_tenant_id')::bigint);

Tuần 16 deep dive.


4. Audit Logging

4.1 Postgres audit

# postgresql.conf
log_connections = on
log_disconnections = on
log_statement = 'ddl'           # log DDL only by default; 'all' for everything but spammy
log_min_duration_statement = -1  # only slow queries normally

For full audit, use pgaudit extension:

shared_preload_libraries = 'pgaudit'
pgaudit.log = 'write, ddl'      # log all writes and DDL
pgaudit.log_relation = on        # include table info

pgaudit.log = 'all' for compliance modes (CIS, HIPAA).

4.2 Application-level audit

Trigger pattern (Tuần 02):

CREATE TABLE audit_log (
    id bigserial PRIMARY KEY,
    user_id bigint,
    actor_id bigint,
    action text,
    target_table text,
    target_id bigint,
    diff jsonb,
    occurred_at timestamptz DEFAULT now()
);
 
CREATE TRIGGER trg_audit_users
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW EXECUTE FUNCTION audit_trigger();

App sets app.current_user_id before each request:

SET LOCAL app.current_user_id = '42';

4.3 CDC to external audit

Debezium → Kafka → Audit DB (separate, write-once). Better separation.

4.4 What to audit

Always:

  • Authentication (success + fail)
  • Authorization changes (role grants)
  • Schema changes (DDL)
  • Data exports (large SELECT for analyst)
  • Admin actions

Application-level:

  • PII access
  • Financial transactions
  • Permission changes

5. SQL Injection Prevention

5.1 ALWAYS parameterized queries

# BAD
query = f"SELECT * FROM users WHERE id = {user_id}"  # injection!
 
# GOOD
query = "SELECT * FROM users WHERE id = $1"
result = conn.execute(query, [user_id])

ORM use parameterized internally → safer by default.

5.2 Cases where prep cannot use parameters

  • Table/column names (use whitelist)
  • ORDER BY direction
  • LIMIT/OFFSET (most drivers OK)
# Whitelist column for sort
SORTABLE = {'name', 'created_at'}
sort_col = request.args.get('sort')
if sort_col not in SORTABLE:
    raise ValueError("invalid sort")
query = f"SELECT * FROM users ORDER BY {sort_col}"

5.3 Dangerous functions

dblink, pg_read_file, lo_import — dangerous if exposed. Restrict to superuser.


6. PII & GDPR

6.1 PII classification

Tag schema with PII level:

COMMENT ON COLUMN users.email IS 'PII:contact';
COMMENT ON COLUMN users.ssn IS 'PII:sensitive';

Tools: pgmask, datafold for scanning.

6.2 Pseudonymization

Replace direct identifiers with token:

UPDATE users SET email = 'user_' || id || '@anonymized.local' WHERE deleted_at IS NOT NULL;

6.3 GDPR Right to Erasure

User requests deletion. Steps:

  1. Delete or anonymize user record
  2. Cascade to associated data (orders → keep but anonymize user_id?)
  3. Audit log: keep record of deletion request
  4. Backups: cannot delete from old immutable backups, but rotation eventually
  5. Analytics/derived: re-process or accept residual
-- Anonymize, don't delete (preserve referential integrity)
UPDATE users SET
    email = 'deleted_' || id,
    name = 'Deleted User',
    deleted_at = now()
WHERE id = $1;

6.4 Data subject access request

User wants their data. Export per-user dump:

COPY (
    SELECT * FROM users WHERE id = $1
) TO STDOUT WITH CSV HEADER;
 
COPY (
    SELECT * FROM orders WHERE user_id = $1
) TO STDOUT WITH CSV HEADER;
-- ... all related tables

Build endpoint or admin tool.

6.5 Data residency

Some users must be in specific region (EU users in EU).

Solutions:

  • Multi-region setup with routing
  • Per-tenant DB in specific region
  • Postgres FDW for cross-region access (rare, slow)

7. PCI-DSS for Card Data

Don’t store card data in your DB if avoidable. Use Stripe/payment provider tokens.

If you must:

  • Encrypt at column level (PCI requirement)
  • Annual penetration test
  • Separate cluster, restricted network
  • Audit every access
  • ~$10K/year compliance overhead

8. HIPAA (Healthcare US)

PHI (Protected Health Information):

  • Encrypt at rest + in transit (mandatory)
  • Audit every access
  • BAA (Business Associate Agreement) with cloud provider
  • Access logs retained 6 years

Postgres on AWS RDS: HIPAA-eligible. Sign BAA with AWS, configure encryption, enable CloudTrail.


9. Backup Security

Tuần 07 covered. Recap:

  • Encryption at rest (KMS-backed)
  • Encryption in transit
  • Separate credentials (write-only for backup user)
  • Immutable storage (S3 Object Lock)
  • Air-gapped copy (rotating offline)
  • MFA delete

10. Database User Lifecycle

10.1 New employee onboarding

CREATE ROLE alice_engineer LOGIN PASSWORD '<temp>';
GRANT app_readonly TO alice_engineer;
ALTER ROLE alice_engineer VALID UNTIL '2027-05-16';  -- annual renewal
ALTER ROLE alice_engineer PASSWORD '<actual>';      -- alice changes

10.2 Departing employee

REVOKE ALL FROM alice_engineer;
DROP ROLE alice_engineer;
-- Audit log retained

10.3 Service accounts

CREATE ROLE service_orders LOGIN PASSWORD '<secret>';
-- Specific privileges
GRANT app_orders_role TO service_orders;
-- Rotate every 90 days

Rotate via Vault or AWS Secrets Manager dynamic credentials.


11. Anti-patterns

PatternWhy bad
App connects as superuserCompromise = wipe
Hard-coded password in codeRepo compromise
Same password for all envsDev compromise → prod compromise
No SSLMITM, sniff
No backup encryptionBackup theft = data leak
Audit log in same DB onlyAttacker deletes
GDPR-style deletes from backupImpossible, rotation only
Role with grant option for allPrivilege escalation
No password rotationStale credentials
Open postgres port to internetMass scan exploit

12. Lab

Day 1: SSL setup

Postgres with self-signed cert. Connect verify-full. Test sniff (Wireshark) before/after.

Day 2: Roles

Build 4 roles (admin, readwrite, readonly, audit). Test each can/cannot.

Day 3: pgaudit

Enable, generate workload, observe logs.

Day 4: RLS test

Setup multi-tenant. Test cross-tenant isolation. Write CI test for it.

Day 5: GDPR exercise

Implement export + delete-by-request endpoints. Test thoroughly.


13. Tiếp theo

Bonus chapters complete. Tiếp theo: Case studies (5 designs).

Cập nhật: 2026-05-16