Data Flow Architecture
Every byte of email content is accounted for. The diagram below traces the complete lifecycle of content from the user's compose window through analysis and back. At every boundary: encrypted in transit, ephemeral in processing, never stored.
| Boundary | Data Crossing | Encrypted | Persisted |
|---|---|---|---|
| Outlook to Add-in | Email body, subject, recipients (via Office.js) | N/A (same process) | No |
| Add-in to API Gateway | Redacted body, subject, recipient domains | TLS 1.3 | No |
| API Gateway to Analysis | Same as above (pass-through) | Internal TLS | No |
| Analysis to LLM API | Email body + subject within prompt | TLS 1.3 | No (contractual) |
| LLM to Analysis | Risk assessment JSON | TLS 1.3 | No |
| Analysis to Metadata Store | Metadata only: ID, timestamp, risk score, categories | TLS + IAM | Yes (metadata only) |
| API to Add-in | Risk assessment (score, flags, suggestions) | TLS 1.3 | No |
Where email content does NOT go
The following systems never receive, process, or store email content: metadata store (Firestore/DynamoDB), application logs (CloudWatch/Cloud Logging), admin dashboard, admin API, monitoring and alerting systems, backup systems. There is no content to back up.
Zero-Retention Architecture
Zero-retention is not a policy. It is enforced by architecture. The analysis service is a serverless function that operates in an isolated execution environment. Content exists in memory for the duration of a single API call and is never written to disk, database, cache, queue, or log.
Architectural constraints that make storage impossible
Each API request executes in an isolated serverless function instance. The function receives email content as an input parameter, processes it in memory, and returns the risk assessment. The function has no attached persistent storage: no EBS volumes, no mounted file systems, no Redis or Memcached. The logAnalysisMetadata() function explicitly constructs a metadata-only object. There is no code path that writes email body, subject, recipients, excerpts, or any content-derived field to any persistent store.
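The metadata-only construction can be sketched in TypeScript. Only `logAnalysisMetadata()` is named in our codebase; the `AnalysisMetadata` and `AnalysisResult` types here are illustrative. The point is structural: the persisted type has no field that could hold content, and the function enumerates fields explicitly rather than spreading the in-memory result.

```typescript
// Illustrative sketch: the persisted record type has no content field,
// so writing email content would be a compile-time type error.
interface AnalysisMetadata {
  analysisId: string;
  timestamp: string;        // ISO 8601
  riskScore: number;        // 0-100
  categories: string[];     // category names only
  processingTimeMs: number;
}

// Illustrative shape of the in-memory result. Content fields exist
// here only for the duration of a single request.
interface AnalysisResult {
  analysisId: string;
  riskScore: number;
  categories: string[];
  processingTimeMs: number;
  emailBody: string;        // in-memory only, never persisted
}

function logAnalysisMetadata(result: AnalysisResult): AnalysisMetadata {
  // Fields are enumerated explicitly; no spread of `result`, so
  // content fields cannot leak into the persisted record.
  return {
    analysisId: result.analysisId,
    timestamp: new Date().toISOString(),
    riskScore: result.riskScore,
    categories: result.categories,
    processingTimeMs: result.processingTimeMs,
  };
}
```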
How to verify our zero-retention claims
We provide six levels of verification, from documentation review to customer-hosted deployment where you control every write path.
Level 1-2: Documentation + API Inspection
Review our data flow documentation. Send a test email through the API, then query the metadata store via admin dashboard. Confirm no content fields exist in stored records. Request a full metadata export for field-by-field verification.
Level 3-4: Source Code Review + Canary Test
Under NDA, review the analysis function source code (less than 500 lines). Trace every write path. Then send emails with unique canary strings and search all persistent storage, logs, and monitoring systems. The canary strings will not appear.
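The canary test above can be sketched as follows. In a real run the records would be pulled from the metadata store, CloudWatch/Cloud Logging, and monitoring systems after the canary email passes through the API; here the record set is a plain array, and the function names are illustrative.

```typescript
import { randomUUID } from "crypto";

// Generate a marker string that cannot occur naturally in logs.
function makeCanary(): string {
  return `CANARY-${randomUUID()}`;
}

// Scan exported records for the canary. Zero-retention holds only if
// this returns an empty array for every persistent store and log.
function findCanary(canary: string, records: string[]): string[] {
  return records.filter((record) => record.includes(canary));
}
```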
Level 5: Customer-Hosted Deployment
Run Praelyx in your own infrastructure. You control all network egress, storage, and logging. You can inspect every outbound call. This is definitive: you control all write paths and can verify empirically.
Level 6: Confidential Computing
Praelyx runs inside a Nitro Enclave or Confidential VM. Verify the running code via cryptographic attestation. No persistent storage, no admin access. Even a compromised Praelyx employee cannot extract content.
Three tiers of protection
Every customer gets our full zero-retention architecture as the baseline. Enhanced and Maximum tiers add hardware-level isolation, end-to-end envelope encryption, confidential computing, and customer-hosted deployment for organizations with the most stringent requirements.
Standard
- Client-side rule engine (50-80% of checks never leave the device)
- Client-side NER redaction of PII before transmission
- TLS 1.3 on all boundaries
- Ephemeral serverless processing, no content storage
- Hashed user identifiers, domain-only recipients
- Differential privacy on aggregate analytics
- Quarterly transparency report + warrant canary
- SOC 2 Type II compliance
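The differential-privacy bullet above can be illustrated with a minimal sketch: Laplace noise added to aggregate counts before they reach the dashboard. The sensitivity-1 counting query and the injectable random source are assumptions of this sketch, not a description of the production mechanism.

```typescript
// Laplace noise for an ε-differentially-private count (sensitivity 1).
// `uniform` is injectable so the sampler can be made deterministic.
function laplaceNoise(epsilon: number, uniform: () => number = Math.random): number {
  const b = 1 / epsilon;                // scale = sensitivity / epsilon
  const u = uniform() - 0.5;            // u in (-0.5, 0.5)
  // Inverse-CDF sampling of the Laplace distribution.
  return -b * Math.sign(u) * Math.log(1 - 2 * Math.abs(u));
}

// Noisy, non-negative count suitable for aggregate analytics.
function privateCount(trueCount: number, epsilon: number, uniform?: () => number): number {
  return Math.max(0, Math.round(trueCount + laplaceNoise(epsilon, uniform)));
}
```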
Enhanced
- Everything in Standard, plus:
- End-to-end envelope encryption to enclave public key
- Secure enclave processing (Nitro / Confidential VM)
- Remote attestation verified by client before every session
- OHTTP relay (Praelyx never sees client IP)
- Dual-KMS key management (customer holds veto)
- Dedicated infrastructure per customer
- HIPAA BAA, ISO 27001, data residency options
Maximum
- Everything in Enhanced, plus:
- Customer-hosted deployment (Docker/K8s in your VPC)
- Fully air-gapped option (no internet connectivity)
- On-premise LLM (no external API calls)
- Threshold secret sharing (Shamir's) with independent escrow
- Full source code access under license
- Reproducible builds + customer-verifiable enclave images
- ZK attestation for rule engine execution
Feature comparison
| Capability | Standard | Enhanced | Maximum |
|---|---|---|---|
| Zero content storage | ✓ | ✓ | ✓ |
| Client-side rule engine | ✓ | ✓ | ✓ |
| Client-side PII redaction (NER) | ✓ | ✓ | ✓ |
| TLS 1.3 all boundaries | ✓ | ✓ | ✓ |
| Hashed identifiers | ✓ | ✓ | ✓ |
| Differential privacy on analytics | ✓ | ✓ | ✓ |
| End-to-end envelope encryption | — | ✓ | ✓ |
| Confidential computing (enclave) | — | ✓ | ✓ |
| Remote attestation | — | ✓ | ✓ |
| OHTTP relay (IP masking) | — | ✓ | ✓ |
| Dual-KMS key management | — | ✓ | ✓ |
| Dedicated infrastructure | — | ✓ | ✓ |
| Customer-hosted deployment | — | — | ✓ |
| Air-gapped option | — | — | ✓ |
| On-premise LLM (no external calls) | — | — | ✓ |
| Full source code access | — | — | ✓ |
| ZK attestation | — | — | ✓ |
Encryption at every layer
Three distinct encryption layers protect content through its entire lifecycle: in transit, during processing, and for metadata at rest. In Enhanced and Maximum tiers, a fourth layer adds end-to-end envelope encryption where even our own infrastructure cannot read the payload.
In Transit: TLS 1.3
All API communication uses TLS 1.3 with certificate pinning in the Outlook add-in. HSTS headers are enforced on all endpoints. TLS 1.3's ephemeral key exchange provides forward secrecy, ensuring that compromise of long-term keys does not expose past sessions.
TLS_AES_256_GCM_SHA384

In Processing: Confidential Computing
Enhanced and Maximum tiers run analysis inside AWS Nitro Enclaves or GCP Confidential VMs. The host operating system, hypervisor, and cloud provider cannot access memory during processing. Content is encrypted by the hardware and only decrypted inside the isolated execution environment.
AMD SEV-SNP / AWS Nitro Enclave

At Rest: AES-256 (Metadata Only)
Email content is never at rest. The only persisted data is analysis metadata (risk scores, categories, timestamps). This metadata is encrypted at rest using AES-256 via the cloud provider's managed encryption (AWS KMS / GCP CMEK). Enhanced tier supports customer-managed encryption keys (BYOK).
AES-256-GCM / Customer-Managed Keys

End-to-End Envelope Encryption (Enhanced + Maximum)
The add-in generates a per-request AES-256-GCM session key, encrypts the payload, then encrypts the session key with the enclave's RSA/X25519 public key. The API Gateway, load balancers, and WAF see only the encrypted envelope. They can route based on headers (tenant_id, content-length) but cannot inspect the payload. The enclave decrypts with a private key generated at boot and never exported. If the enclave image changes, the key changes, and clients refuse to encrypt until they verify the new attestation.
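The envelope scheme above can be sketched with Node's standard crypto module. The `seal`/`open` names are illustrative, and RSA-OAEP stands in for the RSA/X25519 key wrap; everything outside the enclave sees only the `seal` output.

```typescript
import {
  generateKeyPairSync, // used in the demo below to stand in for the enclave keypair
  publicEncrypt,
  privateDecrypt,
  randomBytes,
  createCipheriv,
  createDecipheriv,
  constants,
} from "crypto";

// Client side: per-request AES-256-GCM session key, wrapped with the
// enclave's public key. Middleboxes see only this envelope.
function seal(payload: Buffer, enclavePublicKey: string) {
  const sessionKey = randomBytes(32);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", sessionKey, iv);
  const ciphertext = Buffer.concat([cipher.update(payload), cipher.final()]);
  const tag = cipher.getAuthTag();
  const wrappedKey = publicEncrypt(
    { key: enclavePublicKey, padding: constants.RSA_PKCS1_OAEP_PADDING },
    sessionKey
  );
  return { wrappedKey, iv, ciphertext, tag };
}

// Enclave side: the private key is generated at boot and never exported.
function open(env: ReturnType<typeof seal>, enclavePrivateKey: string): Buffer {
  const sessionKey = privateDecrypt(
    { key: enclavePrivateKey, padding: constants.RSA_PKCS1_OAEP_PADDING },
    env.wrappedKey
  );
  const decipher = createDecipheriv("aes-256-gcm", sessionKey, env.iv);
  decipher.setAuthTag(env.tag);
  return Buffer.concat([decipher.update(env.ciphertext), decipher.final()]);
}
```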
Compliance and certifications
Praelyx's zero-retention architecture simplifies compliance dramatically. The core challenge in most frameworks is securing data at rest. We have no content at rest to secure. Below is the current status of each certification and our projected timeline.
Certification roadmap
GDPR: Privacy by design and by default
Praelyx processes email content as a data processor under GDPR Article 28. Our zero-retention architecture means we do not store personal data from email content. Metadata stored contains only hashed identifiers and aggregate analytics with differential privacy. We provide a standard Data Processing Agreement (DPA) template, support data residency in EU regions, and honor data subject access requests (which return confirmation that no personal data is held). User IDs are hashed before reaching Praelyx servers. Recipient addresses are domain-only or hashed. IP addresses are never logged.
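The identifier hashing described above can be sketched as follows. The per-tenant salt is an assumption of this sketch (it prevents the same user ID from producing the same hash across tenants); the production derivation may differ.

```typescript
import { createHash } from "crypto";

// Client-side identifier hashing: the raw user ID never leaves the
// device. The tenant salt (an assumption here) blocks cross-tenant
// correlation and rainbow-table lookups of common addresses.
function hashUserId(userId: string, tenantSalt: string): string {
  return createHash("sha256")
    .update(`${tenantSalt}:${userId.toLowerCase()}`)
    .digest("hex");
}
```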
HIPAA: Healthcare-ready architecture
Praelyx will execute Business Associate Agreements (BAAs) with healthcare customers. Our architecture is unusually well-suited for HIPAA because the core compliance challenge, securing ePHI at rest, does not apply. We have no ePHI at rest. For maximum assurance, healthcare customers can deploy Praelyx in their own AWS VPC with LLM calls routed to Bedrock via PrivateLink (no public internet). AWS's BAA covers Bedrock for ePHI processing. Email content never leaves the customer's VPC.
Confidential computing
The system should be unable to betray its customers, not merely unwilling. Confidential computing provides hardware-enforced isolation where the host OS, hypervisor, cloud provider, and Praelyx engineers cannot access content during processing.
Cryptographic attestation flow
1. At boot, hardware measures every byte of the enclave image and stores the hashes in Platform Configuration Registers (PCRs).
2. The enclave generates an attestation document signed by the hardware root of trust (AWS Nitro Hypervisor PKI or AMD VCEK).
3. Before its first encrypted request, the add-in fetches and verifies this attestation, comparing PCR values against published values in the transparency log.
4. On mismatch, communication is refused and both the user and the admin are alerted. On match, encrypted communication proceeds.
5. Attestation is re-verified every 15 minutes and embedded in every API response header.
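The client-side decision in steps 3-4 can be sketched as a pure check. The `AttestationDoc` shape is illustrative, and signature verification against the hardware root of trust is assumed to have already succeeded before this function runs.

```typescript
interface AttestationDoc {
  pcrs: Record<string, string>; // PCR index -> hex measurement
  timestampMs: number;          // when the document was signed
}

// Every published PCR must match the attested value, and the document
// must be fresh (15-minute re-verification window per the flow above).
function attestationOk(
  doc: AttestationDoc,
  publishedPcrs: Record<string, string>,
  nowMs: number
): boolean {
  const fresh = nowMs - doc.timestampMs <= 15 * 60 * 1000;
  const pcrsMatch = Object.entries(publishedPcrs).every(
    ([index, expected]) => doc.pcrs[index] === expected
  );
  return fresh && pcrsMatch;
}
```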
Customer-hosted deployment
For organizations that require email content to never leave their network, Praelyx can be deployed entirely within the customer's own infrastructure. Three deployment options cover the spectrum from standard VPC to fully air-gapped environments.
Option A: Docker / Kubernetes
Helm chart for Kubernetes deployment, Docker Compose for simpler setups, Terraform/CDK modules for infrastructure provisioning. Connects to customer's IdP. All data stays in VPC. License heartbeat (outbound only, no email data).
Option B: Private Link / VPN
Praelyx-managed service but traffic never traverses public internet. AWS PrivateLink, Azure Private Link, GCP Private Service Connect, or site-to-site VPN as fallback.
Option C: Fully Air-Gapped
All model weights, rule sets, and configuration bundled. Updates delivered via signed, encrypted packages on physical media or secure file transfer. No network connectivity to Praelyx or any external service. License enforcement via hardware dongle or time-locked cryptographic license.
Praelyx Access: Zero
In all self-hosted models, Praelyx has zero access to the customer's environment. The customer has full control over network boundaries, logging, and access. We provide the software; you control the infrastructure.
What we log vs. what we never log
Every field logged by Praelyx is listed here. If a field is not on this list, it is not logged. We publish this exhaustively because transparency, not trust, is the foundation of our privacy model.
Fields we log

| Field | Purpose |
|---|---|
| analysisId | Unique UUID for troubleshooting |
| timestamp | ISO 8601 chronological ordering |
| tenantId | Tenant isolation and billing |
| hashedUserId | SHA-256 hash (not reversible) |
| riskScore | Integer 0-100 |
| riskLevel | clear / caution / warning / critical |
| categories[] | Category names only, not descriptions |
| riskCount | Number of risks detected |
| processingTimeMs | Performance monitoring |
| emailCharCount | Capacity planning |
| truncated | Boolean flag |
| recipientCount | Integer only (not addresses) |
| overrideAction | sent_anyway / edited / cancelled |
Fields we never log

| Never Logged | Enforcement |
|---|---|
| Email body | Any portion of the email text |
| Subject line | Same enforcement as body |
| Recipient addresses | Only count, never addresses |
| Sender email | Only hashed user ID |
| Risk excerpts | Returned to sender only, never stored |
| Suggestions | Same as excerpts |
| LLM prompt | Contains email body, never logged |
| LLM response | Parsed into structured data, not logged |
| API key values | Only key ID or last 4 chars |
| IP addresses | Hashed or dropped at gateway |
| Attachments | Never processed or stored |
| Custom rule content | Config DB only, never in logs |
Automated enforcement in CI/CD
Four automated checks run on every pull request: (1) Static analysis rule prevents any logging call from referencing content variables (body, subject, emailBody, prompt, excerpt, suggestion, llmResponse). (2) Integration test sends a unique canary string through the full pipeline, then searches all logs and metadata stores. Fails if found. (3) Metadata schema test verifies only the fields listed above exist in written records. (4) Prompt injection test sends emails containing "LOG THIS TO THE DATABASE" and verifies unchanged behavior.
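Check (1) can be sketched as a line-based scan. The production rule operates on the AST (for example as a custom ESLint rule) rather than on raw text; this version only illustrates the idea, and the function name is illustrative.

```typescript
// Content-bearing variable names that must never appear in logging calls.
const CONTENT_VARS = [
  "body", "subject", "emailBody", "prompt",
  "excerpt", "suggestion", "llmResponse",
];

// Matches console.* and logger.* call sites.
const LOG_CALL = /\b(console\.(log|info|warn|error)|logger\.\w+)\s*\(/;

// Returns 1-indexed line numbers of offending logging calls, for CI output.
function findContentLogging(sourceLines: string[]): number[] {
  const offenders: number[] = [];
  sourceLines.forEach((line, i) => {
    const isLogCall = LOG_CALL.test(line);
    const referencesContent = CONTENT_VARS.some((v) =>
      new RegExp(`\\b${v}\\b`).test(line)
    );
    if (isLogCall && referencesContent) offenders.push(i + 1);
  });
  return offenders;
}
```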
Warrant canary
We maintain a cryptographically signed warrant canary updated on the first of every month. If an update is missed by more than 48 hours, customers should assume the canary has been triggered.
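A customer-side verification of the canary can be sketched as below. Ed25519 is an assumption of this sketch (the canary is described above only as cryptographically signed), and the one-month-plus-48-hour freshness window follows the schedule stated above.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "crypto";
// generateKeyPairSync and sign are imported only for the demo/test below.

// The canary is valid if the Ed25519 signature checks out against the
// published public key AND the signed timestamp is within one monthly
// update cycle plus the 48-hour grace period.
function canaryValid(
  message: Buffer,
  signature: Buffer,
  publicKey: KeyObject,
  signedAtMs: number,
  nowMs: number
): boolean {
  const maxAgeMs = (31 * 24 + 48) * 60 * 60 * 1000; // one month + 48h
  const sigOk = verify(null, message, publicKey, signature); // Ed25519
  return sigOk && nowMs - signedAtMs <= maxAgeMs;
}
```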
Penetration testing
Praelyx commits to regular, independent penetration testing. We believe security claims without independent verification are marketing, not engineering. Results are shared with enterprise customers under NDA.
Annual External Penetration Test
Conducted by an independent, reputable security firm. Covers API security, authentication/authorization, injection attacks, business logic flaws, and infrastructure configuration. Results and remediation timelines shared with enterprise customers under NDA.
Continuous Vulnerability Scanning
Automated weekly vulnerability scanning of all infrastructure. Dependency scanning on every deployment. Critical vulnerabilities patched within 24 hours. High within 7 days. Medium within 30 days.
Canary String Verification
Independent assessors send emails with unique canary strings through the full pipeline, then verify those strings do not appear in any persistent store, log, or monitoring system. This is the empirical proof of zero-retention.
Responsible Disclosure Program
We welcome security researchers. Responsible disclosure reports are acknowledged within 24 hours, triaged within 72 hours, and remediated on the same SLA as internal findings. No legal action against good-faith researchers.
Sub-processor list
Complete transparency about every third party that processes data on behalf of Praelyx customers. This list is updated whenever a sub-processor is added or removed. Customers on Enhanced and Maximum tiers receive advance notice of changes.
| Sub-Processor | Purpose | Data Processed | Location | Content Access |
|---|---|---|---|---|
| Anthropic | LLM inference via API | Email content (within prompt during inference only) | US | Transient only. Zero retention. Not used for training. |
| Amazon Web Services | Cloud infrastructure (Lambda, API Gateway, DynamoDB, Bedrock) | Encrypted metadata. Content in Lambda memory only. | US, EU (configurable) | No content access. Infrastructure only. |
| Google Cloud Platform | Cloud infrastructure (Cloud Functions, Firestore) | Encrypted metadata. Content in function memory only. | US, EU (configurable) | No content access. Infrastructure only. |
| Cloudflare | DNS, CDN, WAF, OHTTP relay (Enhanced tier) | Request routing metadata. Encrypted payloads only. | Global edge | No content access. Encrypted payloads. |
| Microsoft Azure | Azure AD for SSO (admin dashboard) | Admin user authentication tokens | US, EU (configurable) | No content access. Auth only. |
Sub-processor change notification
Customers are notified at least 30 days before any new sub-processor is added. Enhanced and Maximum tier customers have the right to object to new sub-processors and terminate the agreement if the objection is not resolved. The current sub-processor list is always available at this URL and via the admin dashboard.
Security questions from CISOs
The ten questions most commonly asked by security teams during procurement review, with thorough answers grounded in our architecture documentation.
Do you store email content?
No. Email body text, subject lines, and recipient addresses are processed in memory and discarded when the serverless function execution completes. There is no database table, file, cache, queue, or log that holds email content. This is enforced by code architecture (the analysis function has no code path that writes content to any persistent store), verified by automated CI tests (canary string tests, metadata schema tests), and auditable by any customer or third-party assessor (source code review available under NDA).
The only data persisted is analysis metadata: analysisId, timestamp, riskScore, riskLevel, categories[] (names only), riskCount, processingTimeMs, emailCharCount, and truncated. No content, no excerpts, no subject lines, no recipient addresses.
Is our email content used to train AI models?
No. We use Anthropic's commercial API and/or AWS Bedrock. Under Anthropic's commercial terms, API inputs are not used for model training and are not retained after the response is generated. Under AWS Bedrock's terms, inputs are not stored or logged, and are not used for model training. Both providers offer contractual commitments to this effect via their enterprise agreements. For customers requiring additional assurance, our Maximum tier runs a local LLM within the customer's own infrastructure with no external API calls.
Can Praelyx employees read our emails?
No. This is not a policy constraint. It is an architectural constraint. There is no system through which a Praelyx employee could access email content, because email content does not exist in any accessible store. Application logs contain only metadata. The metadata store contains only metadata. The admin dashboard shows aggregate statistics only. In Enhanced and Maximum tiers, content is processed inside hardware-isolated enclaves with no SSH, no debug tools, and no remote shell access. Even a compromised Praelyx employee with root access to our infrastructure cannot extract email content.
What happens if you receive a subpoena or government data request?
We cannot provide what we do not have. If served with a legal request for email content, we can demonstrate that no email content exists in our systems. The most a request could compel is analysis metadata (risk scores, categories, timestamps, hashed user IDs), which contains no email text, subject lines, or recipient addresses. We maintain a quarterly transparency report and monthly warrant canary, both cryptographically signed and timestamped. Praelyx is registered in France (Seraphim SAS, 229 rue Saint-Honore, 75001 Paris), and legal requests are evaluated under applicable French and EU law.
What would an attacker gain by breaching your infrastructure?
An attacker who compromises our infrastructure would find: (1) Analysis metadata with no content. (2) Application logs with no content. (3) No database of email bodies, subjects, or recipients to exfiltrate. The attack surface is minimal because the attack target (email content) does not exist at rest. In a worst-case real-time attack on a running Lambda function, an attacker could access the content of one email being processed at that instant, within a 1-3 second window, before the function completes and the content is discarded. Enhanced tier mitigates this further with enclave processing, where even the host OS cannot access function memory.
What data leaves our environment?
In Standard tier: the email body (HTML stripped, max 10,000 characters), subject line, and recipient domains (not full addresses). Sender email address and full recipient addresses are never transmitted. Attachment content is not processed. In Enhanced tier with client-side NER: PII is replaced with typed placeholders ([PERSON_1], [PHONE_1]) before transmission, so personal identifiers never leave the device. The local rule engine handles 50-80% of checks without any network call. In Maximum tier with customer-hosted deployment: nothing leaves your network.
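The typed-placeholder scheme can be illustrated with a minimal sketch. The production redactor is a local NER model, not regexes; the patterns and `redact` name below are illustrative stand-ins that show only the placeholder format and the fact that the placeholder-to-original mapping stays on the device.

```typescript
// Illustrative patterns only; the real system uses client-side NER.
const PATTERNS: Array<[string, RegExp]> = [
  ["EMAIL", /[\w.+-]+@[\w-]+\.[\w.]+/g],
  ["PHONE", /\+?\d[\d\s().-]{7,}\d/g],
];

function redact(text: string): { redacted: string; mapping: Map<string, string> } {
  const mapping = new Map<string, string>(); // never transmitted; stays client-side
  let redacted = text;
  for (const [type, pattern] of PATTERNS) {
    let n = 0;
    redacted = redacted.replace(pattern, (match) => {
      const placeholder = `[${type}_${++n}]`;
      mapping.set(placeholder, match);
      return placeholder;
    });
  }
  return { redacted, mapping };
}
```

Only `redacted` is sent for analysis; the add-in uses `mapping` locally to re-insert originals when displaying suggestions.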
Do you support SSO, MFA, and role-based access control?
Yes. The admin dashboard supports Azure AD / Okta SSO via SAML 2.0 and OIDC. SCIM provisioning is supported for automated user lifecycle management. Role-based access control separates admin roles: Organization Admin (full config access, aggregate analytics), Department Admin (department-scoped analytics, sensitivity configuration), and Viewer (read-only aggregate dashboards). MFA is enforced for all admin access, and session timeouts are enforced. No admin role has access to email content, because no content exists in any accessible system.
What is your incident response process?
Our incident response plan follows NIST SP 800-61 guidelines. Detection: automated monitoring via cloud-native alerting, anomaly detection on API usage patterns, and failed authentication monitoring. Triage: a security officer evaluates severity within 1 hour. Notification: affected customers are notified within 72 hours of a confirmed breach (or within 24 hours for high-severity incidents). Remediation: root cause analysis, containment, eradication, recovery. Post-incident: a public post-mortem for security incidents affecting customers, with timeline, root cause, and corrective actions. Due to our zero-retention architecture, the impact scope of any breach is inherently limited to metadata.
Can we independently audit your zero-retention claims?
Yes. We provide multiple audit paths: (1) Source code review under NDA for the analysis function. (2) Infrastructure audit covering serverless configuration, metadata store schema, and log configuration. (3) Canary token testing, where your team sends marked test emails and searches for them across our persistent storage. (4) SOC 2 Type II report provided annually. (5) Penetration test results shared under NDA. (6) Customer-hosted deployment for organizations that want to audit their own infrastructure rather than ours. Audit rights are included in all enterprise agreements.
How is Praelyx different from DLP or email monitoring tools?
Fundamentally different in architecture and purpose. DLP tools intercept email in transit and maintain persistent content stores for policy enforcement and forensic review. Email monitoring tools continuously analyze and archive employee communications. Praelyx analyzes the email before it is sent, returns a risk assessment to the sender, and discards the content. The sender sees the results. Management sees aggregate statistics (total flags by category, override rates) but never individual emails or per-user flag details. We are a risk spell-check, not a surveillance tool. Our zero-retention architecture means there is no content archive to subpoena, breach, or misuse.
Download security resources
Everything your security team needs to evaluate Praelyx. All documents are available for download or can be requested via your account team.
Ready to evaluate Praelyx?
Request a security briefing with our team. We will walk through the architecture, answer your security questionnaire, and provide access to documentation under NDA.