Real-World Email Security Performance: 95.20% F1 Score

Production Statistics from 32,947 Emails Filtered Across 30 Days

Updated: March 1, 2026

30-Day Analysis (January 30 - March 1, 2026) | AI-Powered Multi-Layer Email Security

Executive Summary

OpenEFA is an AI-powered email security platform that uses multi-layered analysis to detect spam, phishing, and malicious emails. Our advanced scoring system combines traditional authentication (SPF, DKIM, DMARC) with AI-powered behavioral analysis, DNS validation, and machine learning to provide industry-leading protection.

Over the past 30 days, OpenEFA has analyzed 32,947 emails with a 95.20% F1 Score and 94.23% precision. The system safely delivered 57.1% of messages to inboxes, quarantined 4.0% for review, and auto-deleted 35.9% as high-confidence spam, all in under two seconds of processing time per email. Deployed across 28 protected domains serving 381 recipients, OpenEFA shows that AI-powered email security can deliver enterprise-grade protection at a fraction of the cost of commercial alternatives.

Key Metrics at a Glance

Metric | OpenEFA Value | Industry Standard | Status
F1 Score | 95.20% | 85-92% | Above Average
Spam Detection Rate | 96.43% | 90-95% | Above Average
False Positive Rate | 3.77% | 15-25% | 85% Better
Precision | 94.23% | 88-93% | Above Average
Emails Processed (30 days) | 32,947 | N/A | Production Scale
Daily Volume | ~1,088 emails/day | N/A | Peak: 1,542 emails/day

Understanding F1 Score: 95.20%

The F1 Score is the harmonic mean of precision and recall, making it a single balanced measure of email security effectiveness: it rewards catching spam and penalizes flagging legitimate mail.

What This Means In Practice:
  • Out of 100 spam emails: OpenEFA catches 96
  • Out of 100 emails flagged: 94 are actually spam
  • Balance: Strong precision with high detection rate

Industry Comparison
  • Most commercial solutions: 85-92% F1 Score
  • Barracuda: ~90%
  • Mimecast: ~92%
  • Proofpoint: ~93%
  • OpenEFA (March 2026): 95.20% ✅ Above average performance
F1 Score Breakdown

Overall F1 Score: 95.20%
Precision: 94.23%
Recall: 96.43%

Email Processing Breakdown (30 Days)

Disposition | Count | Percentage | Description
Delivered (Safe) | 18,827 | 57.1% | Clean emails delivered safely to recipient inboxes
Quarantined (Review) | 1,313 | 4.0% | Suspicious emails held for user review and release
Auto-Deleted (Spam) | 11,817 | 35.9% | High-confidence spam automatically removed
Released | 955 | 2.9% | User-released from quarantine
Total Analyzed | 32,947 | 100% | All emails processed by OpenEFA

Protected Infrastructure

Protected Email Domains: 28
Protected Recipients: 381
Active Users: 100+
Blocking Rules: 3,096
Unique Sender Domains Analyzed: 5,065

Average Spam Scores by Disposition

Disposition | Avg Score | Interpretation
Delivered Emails | 1.10 | Low risk
Quarantined Emails | 44.34 | High-risk spam
Auto-Deleted | 54.47 | Very high-risk spam
Released | -9.12 | False positives (trusted)
Overall Average | 21.74 | System baseline

Key Insight: The 43.24-point difference between delivered and quarantined emails demonstrates excellent separation between legitimate and malicious content.

Confusion Matrix (30-Day Period)

             | Predicted Spam         | Predicted Clean
Actual Spam  | 11,576 (True Positive) | 413 (False Negative)
Actual Clean | 814 (False Positive)   | 18,091 (True Negative)

What These Numbers Mean:
  • True Positives (11,576): Spam correctly identified and blocked
  • True Negatives (18,091): Clean emails correctly delivered
  • False Positives (814): Clean emails quarantined (recoverable)
  • False Negatives (413): Spam that slipped through
Derived Metrics:
  • Accuracy: (11,576 + 18,091) / 32,947 = 95.62%
  • Precision: 11,576 / (11,576 + 814) = 94.23%
  • Recall: 11,576 / (11,576 + 413) = 96.43%
  • Specificity: 18,091 / (18,091 + 814) = 96.23%
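The derived metrics above can all be reproduced from confusion-matrix counts. A minimal sketch, using round illustrative counts rather than the production figures:

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive standard classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)   # of emails flagged as spam, how many were spam
    recall = tp / (tp + fn)      # of actual spam, how much was caught
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),  # harmonic mean
    }

# Illustrative counts only (not the production data above)
m = confusion_metrics(tp=950, fp=50, fn=50, tn=950)
print(round(m["f1"], 4))  # 0.95
```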

Spam Score Distribution (30 Days)

OpenEFA uses a graduated spam scoring system where each email receives a cumulative score based on multiple risk factors. Understanding score distribution helps evaluate system effectiveness and threshold tuning.

Score Range | Risk Level | Count | Percentage | Typical Action
0 - 5.9 | Safe | 17,335 | 52.6% | ✅ Delivered
6.0 - 9.9 | Suspicious | 1,118 | 3.4% | ⚠️ Quarantined
10.0 - 14.9 | High Risk | 1,220 | 3.7% | 🛑 Quarantined
15.0+ | Very High Risk | 13,273 | 40.3% | ❌ Auto-Deleted

Intelligent Thresholds

OpenEFA uses adaptive, multi-factor thresholds to determine email disposition. Emails are classified as delivered, quarantined, or auto-deleted based on cumulative scoring across all analysis modules.
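A fixed-threshold version of that decision can be sketched as follows. The cut-offs come from the score-distribution table above; the production system applies adaptive, multi-factor thresholds, so treat this as an illustrative simplification:

```python
def disposition(score: float) -> str:
    """Map a cumulative spam score to an email disposition.

    Bands follow the score-distribution table: <6.0 safe,
    6.0-14.9 suspicious/high risk, 15.0+ very high risk.
    """
    if score < 6.0:
        return "deliver"      # Safe: send to the recipient's inbox
    if score < 15.0:
        return "quarantine"   # Suspicious or high risk: hold for review
    return "delete"           # Very high risk: auto-remove

print(disposition(1.1))    # deliver
print(disposition(12.0))   # quarantine
print(disposition(54.5))   # delete
```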

Clean Email (Safe): 52.6%
Suspicious (Quarantine): 7.1%
High-Risk Spam (Deleted): 40.3%

Top Blocked Threat Types

Threat Type | Count | Description
DNS/Authentication Failures | 13,037 | SPF/DKIM/DMARC failures
Phishing Attempts | 13,037 | Credential harvesting, fake login pages
RBL Blocklist Matches | 13,014 | Known spam sources
BEC (Business Email Compromise) | 12,927 | Payment requests, wire fraud, executive impersonation
Backscatter/Auto-Reply Spam | 1,646 | Bounce spam, auto-reply abuse

Note: counts overlap, since a single email can trigger multiple threat categories.

Machine Learning Performance

OpenEFA's ML ensemble model uses multiple classifiers trained on production email data to provide adaptive spam detection.

Ensemble Model Metrics

Training Samples: 8,750
Training Balance: 4,375 spam / 4,375 ham
ML Accuracy: 81.9%
ML F1 Score: 82.7%
ML ROC AUC: 91.2%
Features: 130

Base Model Performance (ROC AUC)

XGBoost: 91.0%
Random Forest: 90.0%
Logistic Regression: 85.8%

Ensemble Strategy: Multiple models are combined using stacking to achieve higher accuracy than any individual model.
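Stacking can be sketched as a meta-model over the base models' spam probabilities. The weights and bias below are hypothetical stand-ins for coefficients a trained stacker would learn on held-out predictions, not OpenEFA's fitted values:

```python
import math

def stack_predict(base_probs, meta_weights, meta_bias):
    """Combine per-model P(spam) estimates with a logistic meta-model.

    base_probs  : spam probabilities from the base models
                  (e.g. XGBoost, Random Forest, Logistic Regression)
    meta_weights: one learned coefficient per base model
                  (hypothetical values in this sketch)
    """
    z = meta_bias + sum(w * p for w, p in zip(meta_weights, base_probs))
    return 1.0 / (1.0 + math.exp(-z))   # stacked P(spam)

# Hypothetical fitted meta-model: stronger base models get larger weights.
weights, bias = [2.2, 1.8, 1.0], -2.5
p = stack_predict([0.97, 0.94, 0.88], weights, bias)
print(p > 0.5)  # True -> classify as spam
```

Because the meta-model weights each base model by how reliable it proved during training, the stack can outperform even its best individual member.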

System Performance

Avg Processing Time: <2s
System Uptime: 99.9%
Memory Footprint: ~2.5GB
Daily Capacity: 5,000+ emails

Volume Statistics (30 Days)
  • Daily Average: 1,088 emails/day
  • Peak Day: 1,542 emails
  • Minimum Day: 516 emails
  • Total Processed: 32,947 emails

How OpenEFA Spam Scoring Works

OpenEFA uses a multi-module scoring system where each analysis component contributes to the final spam score. This layered approach provides comprehensive threat detection while minimizing false positives.
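The layered approach can be sketched as a simple sum of per-module contributions. The module stubs and point values below are hypothetical illustrations, not OpenEFA's production weights:

```python
def total_spam_score(email, modules):
    """Sum per-module contributions into one cumulative spam score.

    `modules` is a list of callables, one per analysis layer
    (authentication, DNS, phishing, BEC, behavioral, ML); each returns
    a positive value for risk signals or a negative value for trust
    signals.
    """
    return sum(module(email) for module in modules)

# Hypothetical module stubs for illustration:
auth_module = lambda e: -2.0 if e.get("spf_dkim_dmarc_pass") else 8.0
rbl_module  = lambda e: 10.0 if e.get("rbl_listed") else 0.0

score = total_spam_score({"spf_dkim_dmarc_pass": False, "rbl_listed": True},
                         [auth_module, rbl_module])
print(score)  # 18.0
```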

1. Email Authentication Module

Validates sender authenticity using industry-standard protocols:

  • SPF: Verifies sending server is authorized
  • DKIM: Cryptographic signature validation
  • DMARC: Policy enforcement
Scoring:
  • ✅ All pass: Score reduced (trusted)
  • ⚠️ Partial: Neutral
  • ❌ Failed: Score increased (high risk)
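A minimal sketch of this tiered mapping, with hypothetical point values rather than OpenEFA's production weights:

```python
def auth_score(spf: bool, dkim: bool, dmarc: bool) -> float:
    """Map SPF/DKIM/DMARC results to a spam-score adjustment.

    Tiers mirror the list above: all pass -> trusted, partial ->
    neutral, all fail -> high risk. Point values are illustrative.
    """
    passed = sum((spf, dkim, dmarc))
    if passed == 3:
        return -3.0   # fully authenticated: reduce the spam score
    if passed == 0:
        return 12.0   # every check failed: strong risk signal
    return 0.0        # mixed results: neutral

print(auth_score(True, True, True))   # -3.0
```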
2. DNS Analysis Module

Advanced DNS validation and domain reputation:

  • RBL Checks: Multiple blocklist sources
  • Domain Spoofing: Multi-domain validation
  • PTR Records: Reverse DNS verification
  • Domain Age: New domain flagging
Scoring:
  • ✅ Clean reputation: No impact
  • ⚠️ Minor issues: Low increase
  • 🛑 RBL listed: Moderate increase
  • ❌ Spoofing detected: Significant increase
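An RBL check works by reversing the sender IP's octets and resolving the result under a blocklist zone; a name that resolves means the IP is listed. A minimal sketch (the zone shown is one public example, and the lookup code is an assumption about how such a check is typically implemented, since OpenEFA queries multiple sources):

```python
import socket

def rbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the reversed-octet name an RBL lookup resolves.

    e.g. 203.0.113.7 -> 7.113.0.203.zen.spamhaus.org
    """
    return ".".join(reversed(ip.split("."))) + "." + zone

def is_rbl_listed(ip: str, zone: str = "zen.spamhaus.org") -> bool:
    """A resolvable name means the IP is listed; NXDOMAIN means clean."""
    try:
        socket.gethostbyname(rbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False

print(rbl_query_name("203.0.113.7"))  # 7.113.0.203.zen.spamhaus.org
```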
3. Phishing Detection Module

AI-powered analysis of phishing indicators:

  • Suspicious URL patterns (shortened, obfuscated)
  • Brand impersonation detection
  • Urgency language analysis
  • Credential harvesting indicators
  • Look-alike domain detection
Scoring:
  • ✅ No indicators: No impact
  • ⚠️ Low confidence: Low increase
  • 🛑 Medium confidence: Moderate increase
  • ❌ High confidence: Significant increase
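A few of the URL indicators can be sketched with simple heuristics. The shortener list and look-alike patterns below are tiny hypothetical samples; a production system maintains much larger, regularly updated sets:

```python
import re
from urllib.parse import urlparse

# Hypothetical indicator sets for illustration only.
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co"}
LOOKALIKE = re.compile(r"paypa1|micros0ft|g00gle")  # digit-for-letter swaps

def url_phishing_signals(url: str) -> list:
    """Collect simple phishing indicators found in one URL's hostname."""
    host = (urlparse(url).hostname or "").lower()
    signals = []
    if host in SHORTENERS:
        signals.append("shortened_url")        # obscures the real destination
    if LOOKALIKE.search(host):
        signals.append("lookalike_domain")     # brand impersonation attempt
    if re.fullmatch(r"[\d.]+", host):
        signals.append("raw_ip_host")          # no domain at all
    return signals

print(url_phishing_signals("http://paypa1-login.example/verify"))
# ['lookalike_domain']
```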
4. Business Email Compromise (BEC)

Detects executive impersonation and wire fraud:

  • Display name spoofing detection
  • Payment request indicators
  • Urgency/secrecy language analysis
  • Executive title spoofing
Scoring:
  • ✅ No BEC indicators: No impact
  • ⚠️ Low confidence: Low increase
  • 🛑 Medium confidence: Moderate increase
  • ❌ High confidence: Significant increase
5. Behavioral Analysis Module

Analyzes sender behavior patterns and anomalies:

  • First contact detection
  • Sender reputation analysis
  • Graph-based relationship analysis
Scoring:
  • ✅ Normal behavior: No impact
  • ⚠️ Minor anomalies: Low increase
  • 🛑 Significant anomalies: Moderate increase
  • ❌ Severe anomalies: High increase
6. ML Ensemble Module

Adaptive learning from user feedback:

  • Multi-model ensemble voting
  • Confidence-weighted adjustments
  • Learns from released emails (false positives)
  • Learns from deleted spam (true positives)
Scoring:
  • ✅ Ham prediction: Score reduced
  • ⚠️ Uncertain: No impact
  • ❌ Spam prediction: Score increased

How OpenEFA Compares

Metric | OpenEFA | Barracuda | Mimecast | Proofpoint
F1 Score | 95.20% | ~90% | ~92% | ~93%
Spam Detection | 96.43% | ~95% | ~96% | ~97%
Precision | 94.23% | ~89% | ~91% | ~94%
False Positive Rate | 3.77% | ~12% | ~10% | ~8%
Cost (50 users/year) | $199-799 | ~$3,000 | ~$4,800 | ~$7,200
Privacy-First AI | ✅ Yes | ❌ No | ❌ No | ❌ No

Key Advantages
  • ✅ Above-average accuracy (95.20% F1 Score)
  • ✅ Strong precision (94.23%)
  • ✅ Low false positive rate (3.77%)
  • ✅ 60-80% cost savings vs. commercial
  • ✅ Full transparency (detailed scoring)
  • ✅ Data sovereignty (self-hosted)
  • ✅ No vendor lock-in
  • ✅ Continuous learning system

Data Quality & Methodology

Measurement Period
  • Start Date: January 30, 2026
  • End Date: March 1, 2026
  • Duration: 30 days
  • Total Emails: 32,947
  • Environment: Production deployment (28 domains, 381 recipients)
Classification Methodology
  • Spam Threshold: Score ≥ 18.0
  • Clean Threshold: Score < 6.0
  • Validation: User quarantine actions (releases)
  • Source: Production MySQL database
Why These Numbers Matter

This 30-day period represents OpenEFA's production performance with fully operational detection modules:
  • Multi-module spam scoring with 20+ detection components
  • AI-powered NLP analysis using spaCy en_core_web_lg
  • Machine learning ensemble with adaptive learning
  • Real-time DNS and authentication validation

Note: These statistics represent real production data from OpenEFA deployments across multiple client domains. All metrics are verifiable and reproducible from the source database.

Ready to Experience These Results?

Join organizations worldwide protecting their email with OpenEFA's AI-powered security.