# BEX Engine Production Readiness Audit
This document records the current production posture of `pluginengine01` after the HTTP/backend re-audit.
## Current Verdict
The engine is **usable in C/C++ desktop/server apps today** as a portable WASM plugin runtime with:
- Pure C ABI (`bex_engine.h`)
- Async callback API
- Wasmtime Component Model sandboxing
- Per-plugin manifests and capability gates
- HTTP host API with browser-like headers, cookies, compression, HTTP/2
- QuickJS host API for site-specific JavaScript/cipher code
- Redb-backed KV/secrets storage
- FlatBuffer/JSON conversion paths for C++ apps
The engine is **not yet a universal Cloudflare bypass engine**. The default HTTP backend is intentionally portable (`reqwest + rustls`) and does not byte-match Chrome's TLS JA3/JA4 fingerprint.
## Production-Safe Default Backend
Current default:
```toml
reqwest = { version = "0.12", default-features = false, features = [
"rustls-tls", "json", "gzip", "brotli", "deflate", "cookies", "http2"
] }
```
Why this is the default:
- Builds reliably on Linux, macOS, Windows.
- Embeds cleanly into C/C++ apps.
- Avoids native BoringSSL/curl-impersonate build complexity.
- Keeps the repo buildable without unverified crate APIs.
Limitations:
- TLS fingerprint is rustls, not Chrome/BoringSSL.
- Advanced anti-bot systems may still challenge/block.
- Browser-like HTTP headers help with simple checks but do not fix JA3/JA4.
## Cloudflare / Anti-Bot Reality
| Protection type | Default backend | Notes |
|---|---:|---|
| Basic header checks | ✅ | Chrome-like headers, cookies, H2, compression |
| Cookie/session checks | ✅ | `cookie_store(true)` |
| Simple CF managed challenge | ⚠️ | Can pass some; not guaranteed because of the TLS fingerprint |
| CF JS challenge | ⚠️ | Requires plugin/QuickJS solver and session cookies |
| Turnstile/CAPTCHA | ❌ | Needs user interaction/WebView/browser handoff |
| DataDome/PerimeterX/Akamai bot | ❌/⚠️ | Usually requires Chrome TLS + H2 fingerprint impersonation |
## Optional Hardened HTTP Backend Roadmap
For real browser-grade anti-bot bypass, add an **optional Cargo feature**, not the default:
```toml
[features]
default = ["http-reqwest"]
http-reqwest = ["dep:reqwest"]
http-impersonate = ["dep:rquest"] # or another verified crate
```
Requirements before enabling this in production:
1. Verify crate/version exists and API compiles.
2. Verify Chrome profile enum/import path.
3. Verify Linux/macOS/Windows build in CI.
4. Verify cross-compilation for app targets.
5. Verify TLS fingerprint on every target with an external JA3/JA4 tester.
6. Keep `reqwest` fallback for platforms where impersonation cannot build.
Do **not** ship an unverified `rquest = "1.0"` or undocumented builder API.
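On the Rust side, the feature flags above could select the backend at compile time. The following is a minimal sketch, assuming the feature names from the TOML fragment; `http_backend_name` is a hypothetical helper used only for illustration, not an existing engine function.

```rust
/// Reports which HTTP backend this build was compiled with.
/// Hypothetical helper: the name and return values are illustrative only.
#[cfg(feature = "http-impersonate")]
pub fn http_backend_name() -> &'static str {
    // Compiled only when the optional impersonation feature is enabled.
    "impersonate"
}

/// Fallback used whenever `http-impersonate` is not enabled, which keeps
/// the portable reqwest/rustls path as the default on every platform.
#[cfg(not(feature = "http-impersonate"))]
pub fn http_backend_name() -> &'static str {
    "reqwest"
}
```

Because exactly one of the two definitions is compiled, a build without the optional feature never references the impersonation crate at all, which is what lets the `reqwest` fallback keep building on targets where impersonation cannot.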
## C++ App Integration Status
### Good
- ABI is plain C, no Rust types across boundary.
- Callbacks deliver a `payload` pointer plus length; C++ code can copy the data before the callback returns.
- Sync plugin management functions use integer status codes.
- Strings returned by Rust have explicit free functions.
- CMake now links the macOS frameworks and Windows system libraries the engine needs, and no longer forces full static glibc linking.
### Integration Rules
C++ apps must:
1. Keep `BexEngine*` alive until all callbacks complete.
2. Keep `user_data` alive until the callback fires or, if the request is cancelled, until it is known whether the callback will still run.
3. Copy callback payload before returning from callback.
4. Marshal callback results to the app/UI thread if needed.
5. Call `bex_engine_free` only during shutdown, not while new requests are being submitted.
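Rules 2 to 4 apply to any host that consumes the C ABI, not only C++. The sketch below shows them in Rust, the engine's own language. The callback signature is an assumption made for illustration (the real one is defined in `bex_engine.h`), and `on_result` is a hypothetical name.

```rust
use std::os::raw::c_void;
use std::sync::mpsc::Sender;

/// Hypothetical callback shape: (user_data, payload pointer, payload length).
/// The actual signature in `bex_engine.h` may differ.
extern "C" fn on_result(user_data: *mut c_void, payload: *const u8, len: usize) {
    if user_data.is_null() {
        return;
    }
    // SAFETY (assumed contract): `user_data` points to a Sender that the host
    // keeps alive until this callback has fired (rule 2).
    let tx = unsafe { &*(user_data as *const Sender<Vec<u8>>) };

    // Rule 3: copy the payload into owned memory before returning, because
    // the engine may release the buffer as soon as the callback returns.
    let owned = if payload.is_null() || len == 0 {
        Vec::new()
    } else {
        // SAFETY (assumed contract): `payload` is valid for `len` bytes for
        // the duration of the callback.
        unsafe { std::slice::from_raw_parts(payload, len) }.to_vec()
    };

    // Rule 4: hand the copy to another thread instead of touching app or UI
    // state from the engine's thread. A closed channel is ignored here.
    let _ = tx.send(owned);
}
```

A C++ host would follow the same shape: copy the bytes into an owned container inside the callback, then post that copy to the app or UI thread's queue.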
## Known Issues To Fix Before Hard Production
These are not architecture blockers, but should be addressed before high-volume deployment:
1. **Cancellation tokens should be removed after async task completion.**
- Current code inserts tokens but does not visibly remove them in `submit_async` after completion.
- Long-running apps may leak entries in the cancellation map.
2. **`bex_engine_secret_get` should guard zero-length buffers.**
- If caller passes `*out_buf_len == 0`, `buf_size - 1` underflows.
- Add explicit `if buf_size == 0 { required_len = value.len()+1; return -2; }`.
3. **Scheduler exists but FFI async submit path does not visibly acquire scheduler permits.**
- The architecture has lane semaphores, but `submit_async` currently uses `spawn_blocking` directly.
- Add scheduler acquisition around user/control/background calls for true production backpressure.
4. **HTTP cache key is URL-only.**
- If two plugins request the same URL with different auth headers, one plugin can be served the response cached for the other.
- Make key `plugin_id + method + url + vary-relevant headers`, or disable cache for requests with auth/cookie-sensitive headers.
5. **`call-js-fn` should resolve Promises like `eval-js`.**
- Async JS functions returning promises may stringify as `{}`.
- Reuse the Promise resolution logic from `eval_js` in `worker.rs`.
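For issue 1, one way to guarantee removal is a drop guard that deletes the map entry when the task finishes, whether it completes, fails, or panics. This is a minimal sketch, assuming the cancellation map is a mutex-protected `HashMap` keyed by a numeric request id; `TokenMap`, `TokenGuard`, and `register` are hypothetical names, not the engine's actual types.

```rust
use std::collections::HashMap;
use std::sync::atomic::AtomicBool;
use std::sync::{Arc, Mutex};

/// Assumed shape of the cancellation map: request id -> cancel flag.
type TokenMap = Arc<Mutex<HashMap<u64, Arc<AtomicBool>>>>;

/// Removes its entry from the map when dropped.
struct TokenGuard {
    map: TokenMap,
    request_id: u64,
}

impl Drop for TokenGuard {
    fn drop(&mut self) {
        // Skip cleanup if the lock is poisoned rather than panic in drop.
        if let Ok(mut map) = self.map.lock() {
            map.remove(&self.request_id);
        }
    }
}

/// Inserts a fresh token and returns it with a guard that cleans it up.
/// The async task should own the guard for its whole lifetime.
fn register(map: &TokenMap, request_id: u64) -> (Arc<AtomicBool>, TokenGuard) {
    let token = Arc::new(AtomicBool::new(false));
    map.lock().unwrap().insert(request_id, Arc::clone(&token));
    let guard = TokenGuard { map: Arc::clone(map), request_id };
    (token, guard)
}
```

Moving the guard into the closure passed to `spawn_blocking` would tie the entry's lifetime to the task itself, so no explicit removal call is needed on each exit path.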
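For issue 2, the guard can be written so that `buf_size - 1` is never computed. The sketch below is an assumption-laden illustration rather than the engine's code: `copy_secret` is a hypothetical helper, the `-2` status follows the suggestion above, and treating every too-small buffer (not only an empty one) as `-2` is a simplification chosen here.

```rust
/// Status codes assumed for this sketch.
const BEX_OK: i32 = 0;
const BEX_ERR_BUFFER_TOO_SMALL: i32 = -2;

/// Copies `value` plus a trailing NUL byte into `buf`.
/// `required_len` always receives the size needed, terminator included,
/// so the caller can retry with a large enough buffer.
fn copy_secret(value: &[u8], buf: &mut [u8], required_len: &mut usize) -> i32 {
    *required_len = value.len() + 1;

    // This comparison also covers `buf.len() == 0`, and it never subtracts
    // from the buffer size, so there is no unsigned underflow.
    if buf.len() < *required_len {
        return BEX_ERR_BUFFER_TOO_SMALL;
    }

    buf[..value.len()].copy_from_slice(value);
    buf[value.len()] = 0; // NUL terminator for C callers
    BEX_OK
}
```

Comparing against the required length up front, instead of deriving a copy length from `buf_size - 1`, removes the underflow and gives the caller the size to allocate in one call.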
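For issue 4, the composite key could be modelled as a struct instead of a bare URL string. This is a sketch under stated assumptions: `CacheKey` and `VARY_HEADERS` are hypothetical, and the choice of `authorization` and `cookie` as the vary-relevant headers is illustrative, not taken from the engine.

```rust
/// Headers whose values must separate cache entries (illustrative choice).
const VARY_HEADERS: [&str; 2] = ["authorization", "cookie"];

/// Composite cache key: plugin + method + URL + vary-relevant headers.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CacheKey {
    plugin_id: String,
    method: String,
    url: String,
    vary_headers: Vec<(String, String)>,
}

impl CacheKey {
    fn new(plugin_id: &str, method: &str, url: &str, headers: &[(&str, &str)]) -> Self {
        // Keep only the vary-relevant headers, with lowercased names.
        let mut vary_headers: Vec<(String, String)> = headers
            .iter()
            .map(|(name, value)| (name.to_ascii_lowercase(), value.to_string()))
            .filter(|(name, _)| VARY_HEADERS.iter().any(|h| *h == name.as_str()))
            .collect();
        // Sort so that header order does not produce distinct keys.
        vary_headers.sort();

        CacheKey {
            plugin_id: plugin_id.to_string(),
            method: method.to_ascii_uppercase(),
            url: url.to_string(),
            vary_headers,
        }
    }
}
```

Because the struct derives `Eq` and `Hash`, it can be used directly as a `HashMap` key, and two requests share a cache entry only when every field matches exactly.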
## Plugin Architecture Assessment
The WIT interface is sufficient for multi-site media plugins:
- home/category/search/info/servers/stream
- subtitles/articles
- HTTP with headers/body/method/cache mode
- KV/secrets
- QuickJS for ciphers/player scripts
- logging/clock/rng
The KaiAnime plugin proves the model works end-to-end:
- HTML scraping
- AJAX endpoints
- self-describing episode IDs
- multi-step token encryption/decryption
- server listing
- stream resolution
- HLS + subtitles
However, KaiAnime currently depends on `enc-dec.app` instead of local QuickJS cipher code. For long-term reliability, move site ciphers into plugin-local JS via `call-js-fn`.
## Cross-Platform Matrix
| Target | Current default backend | Status |
|---|---:|---|
| Linux x86_64 | reqwest/rustls | ✅ Good |
| Linux aarch64 | reqwest/rustls | ✅ Good |
| macOS x86_64/arm64 | reqwest/rustls | ✅ Good, CMake framework links added |
| Windows MSVC | reqwest/rustls | ✅ Likely, system libs added |
| Android | Rust staticlib possible | ⚠️ Needs NDK CI proof |
| iOS | Rust staticlib possible | ⚠️ Needs iOS toolchain + Wasmtime support verification |
| Fully static Linux glibc | Not default | ⚠️ Opt-in only via `BEX_FORCE_STATIC_EXE=ON` |
## Recommendation
For production apps today:
1. Use the current default backend for portability.
2. Add CI for Linux/macOS/Windows release builds.
3. Fix the five known issues above.
4. Add optional impersonation backend later, behind a feature flag, only after compile/API/fingerprint verification.
5. For sites with serious bot defenses, provide app-level browser/WebView handoff as fallback.