Plantry
Executive Summary
Vision
Build a mobile app that turns a single photo of a user's pantry or refrigerator into immediate, practical meal recommendations so users stop fretting over what to cook and can act in minutes.
Target users
Primary: Working parents with school-age kids who need quick, family-friendly dinners on weeknights.
Secondary: Single professionals or couples cooking on weeknights, and stay-at-home parents who want faster meal decisions.
Problem being solved
Users experience decision fatigue at mealtime and lack time to plan.
Manually typing ingredient lists is slow, error-prone, and breaks momentum; that friction causes skipped meals or takeout decisions.
How success feels for users
A user snaps a photo, sees several cookable recipes ranked for time and family-fit, and starts cooking within minutes — no typing, no second-guessing.
Key user outcomes
- Faster meal decisions on weeknights.
- Lower cognitive load at dinner time.
- Higher utilization of ingredients on hand (fewer trips to the store).
What Makes This Special
- Photo-first input - the core insight is removing manual entry: a single image becomes the canonical source of truth for available ingredients.
- Immediate, contextual recommendations - recipes are filtered and ranked by what can be cooked now with what’s visible in the photo, prioritizing time-to-table and family suitability.
- UX wedge: camera-to-recipe flow - the instant, reliable "photo in -> recipes out" interaction collapses the hardest part of meal planning and creates a delightful, repeatable habit.
- Practical accuracy over novelty - prioritizing pragmatic recipe matches and cook time beats exotic suggestions; users choose predictability and speed on busy nights.
Project Classification
projectType: "mobile app"
domain: "general"
complexity: "medium"
projectContext: "greenfield"
Success Criteria
User Success
- Primary outcomes: Users feel fast relief and confidence when deciding what to cook from a single photo. This is measured by users starting to cook within the time-to-action target (median ≤ 3 minutes from photo submission) and reporting that a recommended recipe was usable with no extra shopping.
- Key user moments: the "aha" is when a user snaps a photo and sees immediately actionable recipes that match only the visible ingredients and fit family/time constraints. Success is the user tapping "Start" or equivalent and beginning to cook without searching for missing items.
Business Success
- Early traction target: reach 5,000 weekly active users (WAU) by the end of month 3. This is the stated growth milestone for early market validation.
- Engagement & retention: track 7-day and 30-day return rates, plus weekly sessions per active user. These metrics indicate habit formation and product value. Exact percentage targets for retention should be defined with cohort data post-launch, but tracking must be in place from day one.
- Engagement depth: monitor WAU × average weekly sessions per active user as a leading indicator of sustained usage and virality potential.
Technical Success
- Recipe match quality: measurable match rate ≥ 70%, defined as the percentage of recommended recipes that users mark (or implicitly confirm by starting) as usable without needing to buy extra ingredients. This is the prime technical success signal.
- Time-to-action: median photo → start cooking ≤ 3 minutes as the target for a smooth camera-to-recipe flow. This measures perceived speed and convenience.
- Ingredient scope constraint: MVP will match recipes using only ingredients visible in the photo to set a strict, testable boundary for the matching model.
- MVP posture: prioritize match accuracy first (meet the ≥ 70% match rate) even if initial time-to-action is higher; later iterations optimize latency and UI flow.
Measurable Outcomes
- Match rate (primary): ≥ 70% usable-recipe rate, measured by explicit user feedback (usable / not usable) and by conversion to "start cooking" within the session.
- Speed (primary): median photo→start ≤ 3 minutes, measured from image submission to explicit user action indicating cooking has begun (tap start, open full recipe, or similar signal).
- Growth (primary): 5,000 WAU by month 3.
- Retention (secondary): instrument 7-day and 30-day retention; define numerical targets after first cohort, with engineering and analytics readiness required at launch.
- Engagement (secondary): weekly sessions per active user and WAU × average weekly sessions as a composite engagement depth metric.
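As a minimal sketch of how the primary match-rate metric could be computed, the snippet below combines explicit feedback with the implicit "started cooking" signal. The record shape (`RecommendationOutcome`) and the choice to exclude recommendations with no signal from the denominator are assumptions for illustration, not decisions stated in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecommendationOutcome:
    """One recommended recipe shown in a session (hypothetical record shape)."""
    explicit_usable: Optional[bool]  # usable / not usable feedback, None if unanswered
    started_cooking: bool            # user tapped Start Cooking in the same session


def usable_signal(outcome: RecommendationOutcome) -> Optional[bool]:
    # Explicit feedback wins; a start without feedback counts as implicit confirmation.
    if outcome.explicit_usable is not None:
        return outcome.explicit_usable
    if outcome.started_cooking:
        return True
    return None  # assumption: no signal means the record is left out of the rate


def match_rate(outcomes) -> Optional[float]:
    signals = [s for s in map(usable_signal, outcomes) if s is not None]
    if not signals:
        return None
    return sum(signals) / len(signals)


def meets_target(outcomes, target: float = 0.70) -> bool:
    rate = match_rate(outcomes)
    return rate is not None and rate >= target
```

Whether unanswered recommendations belong in the denominator materially changes the reported rate, so that rule is worth fixing before the first cohort is measured.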
Product Scope
MVP - Minimum Viable Product
- Must-haves:
- Photo-first capture flow (single-photo input) with streamlined camera UX.
- Recipe matching engine constrained to ingredients visible in the photo.
- Recipe ranking by time-to-table and family-fit filters.
- Clear start-cooking CTA and simple feedback capture (usable / not usable).
- Basic analytics to measure match rate, time-to-action, WAU, and retention cohorts.
- MVP success criteria: recipe match rate ≥ 70% (primary acceptance criterion). Time-to-action measurement must be in place at launch; reducing the median becomes the priority once the accuracy threshold is met.
- MVP tradeoff: accept longer time-to-action if it materially improves match accuracy during initial experiments.
Growth Features (Post-MVP)
- Optimize time-to-action toward median ≤ 3 minutes through UI and model improvements.
- Relax strict-visibility rules with optional pantry history or user-confirmed inferred ingredients to increase applicable recipe matches where appropriate.
- Personalization (taste/family preferences), saved pantry, smart grocery list export, and repeat-recipe suggestions to increase retention.
- A/B experimentation framework to test ranking heuristics and family-fit signals.
Vision (Future)
- Full meal-planning workflow across multiple meals and days, integrated grocery ordering, multi-photo or video-based pantry scans, and cross-device persistence.
- Community and social features for sharing family-friendly recipes and curated menus.
- External integrations and APIs for recipe partners and retail/listing partners.
User Journeys
1) Working parent - Primary success path (Happy path)
Persona: Mara, working parent with school-age kids, one hour until dinner and no plan.
Opening scene: Mara arrives home, wants dinner in an hour with minimal hassle. The app defaults to a camera-first entry but also surfaces a clear manual entry/last-used recipes shortcut.
Rising action: Mara takes a single photo of the pantry or fridge. The app uploads the image, displays a brief progress indicator and a privacy reminder about image handling, then opens the Suggestions screen with ranked recipes prioritized by time-to-table and family-fit.
Climax: A clear recipe appears that uses visible, high-confidence ingredients. Mara taps Start and follows concise step cards in cook mode.
Resolution: By mealtime the family is eating a home-cooked meal and Mara feels relieved and confident.
High-level screens and choices: Camera (default) → Suggestions (ranked, confidence visible) → Recipe Detail → Start Cooking → Feedback (usable / not usable).
2) Working parent - Edge case (missing ingredient or low-confidence detection)
Situation: Photo is cluttered or the app misses a key ingredient. Mara needs a fast, low-friction recovery so dinner still happens.
Rising action: Suggestions show low-confidence matches or recommend recipes that would require a store run. The UI highlights inferred items with confidence levels.
Climax: The app offers a single, minimal confirmation step: a compact Ingredient Confirmation modal that highlights the inferred item(s) and gives quick options—Confirm, Suggest Substitutes, or Manual Edit—without forcing a full re-scan.
Resolution: Mara confirms a substitute or selects an alternate quick recipe and begins cooking without leaving the house.
High-level screens and choices: Camera → Suggestions (low-confidence state) → Ingredient Confirmation modal (Confirm / Substitute / Edit) → Revised Suggestions → Recipe Detail → Start Cooking.
3) Secondary user - Single professional / couple (quick solo cooking)
Persona: Alex, single professional cooking for one, values speed and minimal cleanup.
Opening scene: Alex takes a quick pantry photo after work or opens a "Quick Meals" shortcut, looking specifically for a 20-minute meal.
Rising action: Suggestions are ranked with a strong bias for short prep, minimal cleanup, and single-portion scaling; confidence and portion-size options are visible at a glance.
Climax: Alex picks a recipe, taps Start, and follows a condensed cook mode with only essential steps and scaled ingredients.
Resolution: Meal is ready quickly and portion sizes avoid waste.
High-level screens and choices: Camera / Quick Shortcut → Suggestions (time and portion filters) → Recipe Detail (scale portions) → Start Cooking (compact mode).
4) Admin / Operations user
Persona: Ops manager who monitors model performance and content quality.
Opening scene: Ops receives daily alerts about drops in match-rate or spikes in low-confidence detections and opens an Admin Dashboard.
Rising action: They review analytics, sample low-confidence images (with user consent/retention rules applied), and inspect recipe metadata and tagging that contributed to mismatches.
Climax: Ops triages issues: tag corrections, content removals, or prioritized model retraining. Actions are traceable to session logs and versioned content edits.
Resolution: Match-rate improves and user complaints fall as content and model changes propagate.
High-level screens and choices: Admin Dashboard (analytics, image samples subject to privacy policy) → Content Editor → Model/Experiment Controls.
5) Support / Troubleshooting user
Persona: Customer support rep resolving a user report that a recommended recipe was unusable.
Opening scene: A user submits feedback “recipe unusable” with the original photo attached.
Rising action: Support pulls the session bundle (image, inferred ingredients with confidence, matched recipe, and the session timeline). Privacy flags are visible when retention limits apply.
Climax: Support offers a remediation path—in-app help, guidance, or escalation to Ops—with the case bundle attached so Ops can reproduce the issue.
Resolution: The issue is documented, the user is satisfied or refunded, and the case may trigger metadata fixes or training data changes.
High-level screens and choices: Support Console (ticket + image + session log + confidence metadata) → Case Actions (reply / escalate / refund) → Link to Ops.
6) API / Integration developer (if applicable)
Persona: Partner developer integrating the match API to submit images and receive recipe suggestions.
Opening scene: Dev registers for an API key and reads the docs for image submission payloads and expected response fields, including confidence and tag metadata.
Rising action: They POST image data (or image reference), receive ranked recipe matches and confidence scores, and debug using response logs and standard error codes.
Climax: Integration goes live and pushes usage metrics to a partner dashboard.
Resolution: The partner reliably surfaces recipe matches in their app and monitors quota and errors.
High-level screens and choices: Developer Portal → API Keys → Request/Response Logs → Webhooks / Monitoring.
Journey Requirements Summary
Capture & UX
- Camera-first capture flow as the default, with explicit alternate entry points (manual ingredient entry, last-used recipes, Quick Meals shortcut) so users always have a fast path.
- Minimal friction: single-photo upload with progress and a short privacy notice; retry guidance and an inline, low-cost confirmation step when detection confidence is low.
- Suggestions screen that shows ranked recipes, visible confidence for inferred items, and quick filters (time, family-fit, portions).
- Compact Ingredient Confirmation UI for low-confidence matches that supports Confirm / Substitute / Edit quickly.
- Recipe Detail and Start Cooking flow with clear CTAs and a compact cook mode for quick recipes.
- Simple feedback capture (usable / not usable) tied to session logs and image references.
Model & Data
- Image processing pipeline that returns detected items with per-item confidence and substitution suggestions.
- Recipe matching engine that ranks by time-to-table and family-fit and can re-rank immediately after user confirmation.
- Robust logging of the image reference, inferred items with confidence, session steps, and matched recipe metadata for reproducibility, subject to image retention rules.
Support & Ops
- Admin dashboard for analytics, image sampling (subject to retention/privacy rules), and content moderation workflows.
- Support console with access to session bundles (image + inferred items + confidence + timeline) for fast troubleshooting.
- Content management tools to edit recipe metadata and remove or re-tag poor matches; all edits are versioned and linked to incidents.
Developer / Integrations
- API endpoints for image submission and recipe responses that include confidence and tag metadata; clear docs, auth, and logging.
- Developer portal with keys, docs, and request/response playground.
Product & Experience
- Explicit fallbacks and low-friction recovery when detection is low: confirmation modal, substitution suggestions, alternate quick recipes, and manual edit.
- Portion scaling and dietary filters to support different household sizes and preferences.
- Clear privacy controls and visible image retention rules presented at capture and accessible from support/ops.
Metrics surfaced by journeys
- Match-rate and usable feedback per session.
- Time-to-action (photo → start cooking) per user segment.
- Admin metrics: low-confidence incidents, support escalations, and content change impact.
Innovation
What’s novel
- Vision-first ingredient detection as the primary wedge: treating a single photo as the canonical pantry/fridge state (no manual typing), enabled by a high-precision ingredient-recognition model tailored to messy, real-world kitchen imagery.
- Camera-to-recipe co-design: the model and UX are designed together so visual confidence drives immediate, actionable recipe suggestions that collapse decision time — not just “identify items,” but reliably enable a user to start cooking from one photo.
- Strict visibility constraint (MVP posture): matching recipes only to ingredients seen in the image to create predictable, practical recommendations and reduce cognitive overhead or surprise shopping trips.
Core assumptions we must validate
- Vision accuracy is good enough in real-world conditions (lighting, occlusion, mixed packaging) to power usable recipe matches without manual correction.
- Users will prefer and adopt a camera-first flow over typing or barcode/manual pantry entry when the end-to-end experience is fast and reliable.
- Confidence signals from the vision model can reliably identify when to ask for a lightweight confirmation vs. show recipes directly (i.e., confidence calibration matters).
- Privacy and latency tradeoffs (on-device vs. cloud inference) will influence adoption — users accept image capture if handled transparently and safely.
Key technical challenges
- Real-world image variability: cluttered shelves, overlapping items, partially obscured labels, identical packaging across brands, and prepared food versus raw ingredients.
- Granularity and taxonomy: deciding detection targets (e.g., “tomato” vs. “canned tomatoes” vs. “tomato sauce”) and supporting substitutions safely.
- Cost of false positives: an incorrectly detected item can yield unusable recipes and erode trust faster than missed detections.
- Quantity and usability: detecting presence is different from inferring sufficient quantity for a recipe; quantity estimation is noisy and optional for MVP.
- Latency and model size: balancing on-device inference for privacy and speed against accuracy limits and update cadence.
- Dataset scarcity: lack of large, labeled datasets of pantry/fridge photos across diverse households and cultural cuisines.
Validation experiments (what to test first and how)
- Offline vision benchmarks: build a representative test set of real pantry/fridge photos; measure per-item precision, recall, and confidence calibration across common ingredient classes and edge cases.
- Usability pilots: in-home or lab tests where participants use the camera flow to get recipes; collect binary usable/not-usable feedback and qualitative failure modes to link vision errors to recipe failures.
- Confidence thresholding experiments: tune when the system shows recipes directly vs. triggers the compact Ingredient Confirmation modal; measure impact on usable-recipe rate and time-to-action.
- Ablation tests: compare strict visible-only matching vs. allowing 1–2 inferred pantry items (from user history or probabilistic inference) to quantify tradeoffs between match rate and user surprise/shopping.
- On-device vs. cloud trials: pilot both deployment modes to measure latency, energy, privacy perception, and model accuracy tradeoffs.
Key validation signals to track (qualitative and quantitative):
- Per-item precision (to reduce false positives).
- Confidence calibration (probability that high-confidence detections are correct).
- Rate at which vision errors lead to “recipe unusable” feedback in pilots.
- User acceptance of photo capture and perceived privacy risk in real tests.
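The first two signals can be computed offline from a labeled test set. The sketch below assumes each reported detection is recorded as a `(confidence, is_correct)` pair and uses expected calibration error (ECE), a common summary of calibration, as one possible way to quantify "confidence calibration"; the document does not prescribe a specific calibration metric, and the bin count is an arbitrary choice.

```python
def precision(detections):
    """detections: list of (confidence, is_correct) pairs for items the model reported."""
    if not detections:
        return None
    return sum(1 for _, ok in detections if ok) / len(detections)


def expected_calibration_error(detections, n_bins=5):
    """Weighted average gap between stated confidence and observed accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in detections:
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 falls in the last bin
        bins[idx].append((conf, ok))

    total = len(detections)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece
```

A low ECE suggests that a fixed confidence threshold will behave predictably when deciding whether to show recipes directly or ask for confirmation.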
Design and UX implications
- Confidence-driven UX: surface per-item confidence, and use a compact, low-friction Ingredient Confirmation modal for low-confidence detections (Confirm / Substitute / Edit) to avoid interrupting flow.
- Conservative matching for MVP: prefer omissions over false positives — better to miss an ingredient than falsely include one that breaks a recipe.
- Fast feedback loop: lightweight, in-app feedback (usable / not usable) tied to the original image to close the loop for model improvement.
- Progressive disclosure: show clear indicators when recipes require items of uncertain detection and offer quick substitutes or alternative recipes.
Tradeoffs and alternatives considered
- Strict visible-only matching (MVP) vs. inferred pantry augmentation: strict matching increases predictability; inferred augmentation can raise match rates but risks recommending recipes that require new shopping.
- On-device inference for privacy/latency vs. cloud for model capacity and update speed: on-device improves privacy and responsiveness; cloud enables larger models and faster iteration.
- Single-photo capture vs. multi-photo / short video scanning: single-photo minimizes friction; multi-photo improves coverage but increases complexity and user effort.
- Vision-only approach vs. hybrid inputs (barcode/OCR/manual): hybrid reduces vision risk at the cost of reintroducing input friction — a candidate post-MVP path.
Operational & privacy considerations
- Build consented image sampling and tooling for ops: enable safe, privacy-compliant image inspection and labeling (user opt-in for images used in training).
- Instrument robust logging (image metadata, detections with confidence, user feedback) to trace failures and prioritize retraining.
- Consider privacy-preserving strategies early (on-device inference, ephemeral uploads, automatic redaction) to increase user trust and reduce legal/ops burden.
How this innovation maps to the MVP posture
- Primary focus: deliver a high-precision, vision-first detection pipeline that produces conservative, high-confidence ingredient lists used to match recipes strictly to visible items.
- UX safeguards: confidence-driven confirmation and quick substitutes prevent small vision errors from breaking the core promise.
- Measurement-forward: early validation emphasizes linking vision errors to real user outcomes (usable recipes, time-to-action) so future investment prioritizes the parts with greatest user impact.
Mobile App Specific Requirements
Platform Support
- Decision: Native iOS (Swift) first, target iOS 16+ on iPhone 12 and newer for MVP to guarantee modern camera and network capabilities. Android will be a native port after iOS launch.
- Backend architecture: Cloud-only inference with server-side large models to prioritize accuracy and fast iteration. Design APIs and response contracts to be platform-agnostic to ease the Android port.
- Latency target: end-to-end median photo→start ≤ 3 minutes for MVP, matching the Technical Success target; set server SLAs and client timeouts to meet this.
Device Capabilities
- Permissions required: Camera access and Photo Library access (to allow selecting existing images). The capture UI must request explicit consent and show the ephemeral-upload privacy notice at the point of capture.
- Image handling: Support HEIC and JPEG input; perform lightweight client-side resizing/compression before upload to limit bandwidth while preserving model accuracy.
- Hardware assumption: iPhone 12+ ensures consistent camera sensor and network characteristics; use this to size image preprocessing and expected upload latency.
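To make the client-side resizing step concrete, here is a small sketch of the dimension calculation only, written in Python for illustration rather than in the app's Swift. The 2048-pixel cap on the longer side is a hypothetical value; the real limit would come from testing how downscaling affects detection accuracy.

```python
def fit_within(width: int, height: int, max_side: int = 2048):
    """Scale (width, height) down so the longer side is at most max_side.

    Preserves aspect ratio and never upscales. max_side is a hypothetical default.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 4032 x 3024 capture would be reduced to 2048 x 1536 before compression and upload.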
Offline Strategy
- MVP posture: No offline inference support - image submission and recipe matching require network connectivity (cloud-only inference).
- Client UX requirements: clear offline state, retry queue for pending uploads, and a lightweight manual-ingredient entry fallback so users can continue when offline.
- Resilience: define conservative timeouts and retries with exponential backoff to avoid blocking the UI and to preserve the ≤ 3-minute experience goal where possible.
Push Notifications
- MVP decision: No push notifications in the MVP. Omit push infrastructure and avoid requesting notification permission on first release.
- Future note: defer engagement and reminder flows until after Android port and stabilization of core vision-match metrics.
Store Compliance
- Distribution: US and EU via the App Store at launch, and later via Google Play; comply with standard store review policies.
- Privacy & GDPR: the privacy policy must state that uploads are ephemeral and that images are deleted immediately after inference unless the user has explicitly opted in to training/ops sampling; implement consent flows and transparent UI language at capture and in settings.
- Data controls & contracts: prepare Data Processing Agreement and vendor reviews; consider EU data residency options and minimal logging to meet GDPR obligations.
Project Scoping & Phased Development
MVP Strategy & Philosophy
MVP Approach: Problem-solving MVP - deliver a single, reliable photo→suggestions→start workflow that eliminates decision fatigue for the working-parent happy path. Prioritize accuracy first (match-rate ≥ 70%) even if time-to-action is higher initially.
Resource Requirements: Solo iOS developer (full-stack on client and simple orchestration) + lightweight cloud inference (managed ML endpoints). Early manual ops for labeling and triage (single operator/contractor or the developer for initial weeks). Minimal analytics and logging instrumented by the developer; no dedicated backend team required at launch.
MVP Feature Set (Phase 1)
Core User Journeys Supported:
- Working parent primary happy path: Camera → Suggestions → Start Cooking
- Low-friction recovery for low-confidence matches via a compact Ingredient Confirmation modal
- Simple feedback loop: usable / not usable tied to the original photo
Must-Have Capabilities:
- Camera-first single-photo capture flow with explicit privacy notice and lightweight client-side resizing/compression
- Cloud inference endpoint that returns detected ingredients with per-item confidence
- Recipe matching constrained to ingredients visible in the photo (strict-visibility rule)
- Ranking by time-to-table and family-fit to surface practical recipes
- Compact Ingredient Confirmation modal (Confirm / Suggest Substitutes / Manual Edit) for low-confidence detections
- Clear Start Cooking CTA and compact cook mode for the selected recipe
- Usable/not-usable feedback capture associated with the session and image
- Minimal analytics to compute match-rate, time-to-action, and basic WAU/retention hooks
- Manual ops workflow for sampling images, labeling errors, and feeding initial training data
Post-MVP Features
Phase 2 (Post-MVP):
- Time-to-action optimizations (UI flow improvements and tighter latency targets)
- Personalization and saved preferences (taste/family settings), plus portion scaling and dietary filters
- Pantry history / optional inferred ingredients (user-confirmed) to boost match coverage
- Basic admin dashboard for sampling and triaging low-confidence incidents (semi-automated)
- A/B experiments for ranking heuristics and confidence thresholds
- Analytics instrumentation refinement (cohort retention, funnels)
Phase 3 (Expansion):
- Android native port and cross-device persistence
- On-device inference options for privacy/latency where feasible
- Grocery ordering integration and multi-meal planning workflows
- API/partner integrations and developer portal
- Community features, sharing, and richer personalization engines
- Automated ops: continuous labeling pipelines, model retraining orchestration, and production monitoring
Risk Mitigation Strategy
Technical Risks:
- Vision accuracy in messy, real-world photos is the riskiest technical assumption.
- Mitigation: conservative detection taxonomy (broader classes), strict visible-only matching for MVP, build a representative test set and run offline benchmarks before broad release, and tune confidence thresholds to gate direct recipe presentation.
- Latency and model iteration speed (cloud inference tradeoffs).
- Mitigation: use managed cloud ML endpoints for fast iteration; accept initial higher time-to-action while improving model and client-side UX.
- Dataset scarcity and labeling overhead.
- Mitigation: start with manual ops labeling, lightweight consented image sampling, and iterative active-learning cycles focused on high-impact failure modes.
Market Risks:
- Users may not adopt camera-first flow if privacy or accuracy concerns persist.
- Mitigation: pilot usability tests with working parents, explicit privacy UI and ephemeral upload language, and focus early experiments on perceived usefulness (usable feedback) rather than broad feature parity.
- Early retention risk if suggestions feel unreliable.
- Mitigation: favor conservative matches, use the Ingredient Confirmation modal to keep flow moving, and instrument usable/not-usable signals to prioritize fixes.
Resource Risks:
- Solo-builder limits velocity and parallel work (iOS + cloud + ops).
- Mitigation: shrink scope to camera→suggestions→start + confirmation + feedback; use managed services (serverless functions, managed ML endpoints, Firebase/Amplify for auth and analytics) to avoid building backend plumbing; defer automated dashboards and heavy analytics to Phase 2; consider short-term contractor for ops/labeling if budget permits.
- If resources are tighter still, further reduce scope to an MVP without the confirmation modal (strict matches only) or a lab/pilot release with a small invited user cohort.
Functional Requirements
Capture & Input
- FR1: End users can capture and submit a single photo of their pantry or refrigerator from within the native mobile app.
- FR2: End users can select an existing photo from the device photo library and submit it for suggestions.
- FR3: The system displays an explicit privacy notice and obtains consent at the moment of capture or upload.
- FR4: End users can retry a failed upload or resubmit a new photo from the capture flow.
Image Detection & Taxonomy
- FR5: The system returns a per-photo list of detected items (ingredients) with a per-item confidence score.
- FR6: The system maps each detected item to a canonical taxonomy ID from the product ingredient taxonomy.
- FR7: The system exposes per-item metadata describing detection evidence (e.g., bounding region, detection tags) for downstream triage and UX display.
- FR8: The system can flag low-confidence detections and mark them for follow-up UX flows.
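One way FR5, FR6, and FR8 could fit together is sketched below: each detection carries a canonical taxonomy ID and a confidence, and a triage step splits detections into accepted and needs-confirmation groups. The `Detection` shape, the ID format, and both threshold values are hypothetical; dropping very low-confidence items reflects the document's stated preference for omissions over false positives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    taxonomy_id: str   # canonical ingredient ID (FR6); the ID format is hypothetical
    label: str
    confidence: float  # 0.0 to 1.0 (FR5)


# Hypothetical thresholds, to be tuned in the confidence thresholding experiments.
CONFIRM_THRESHOLD = 0.80
DROP_THRESHOLD = 0.40


def triage(detections):
    """Split detections into (accepted, needs_confirmation); drop the rest."""
    accepted, needs_confirmation = [], []
    for d in detections:
        if d.confidence >= CONFIRM_THRESHOLD:
            accepted.append(d)
        elif d.confidence >= DROP_THRESHOLD:
            needs_confirmation.append(d)  # candidates for the confirmation modal
        # Below DROP_THRESHOLD: omitted entirely (conservative matching).
    return accepted, needs_confirmation
```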
Recipe Matching & Ranking
- FR9: The system matches recipes only to ingredients detected in the submitted photo (strict-visibility rule) for MVP.
- FR10: The system produces a ranked list of candidate recipes for each photo submission.
- FR11: Each recommended recipe includes a per-recipe confidence score and a short explanation of the match reason (e.g., matched ingredients, time-fit, family-fit).
- FR12: The system supports re-ranking suggestions immediately after user confirmation or edit of inferred ingredients.
- FR13: The system provides substitution suggestions, where applicable, for ingredients that were detected with low confidence or not detected at all.
- FR14: The system supports ranking signals for time-to-table and family-fit and applies them when ordering recipe results.
Recipe Data & Metadata
- FR15: Recipes include structured metadata: ingredient lists (canonical IDs), prep time, cook time, portion sizes, dietary tags (e.g., allergens), and family-fit attributes.
- FR16: Recipes expose portion-scaling information that the app can use to scale ingredient quantities and steps.
- FR17: Recipes expose dietary and allergen tags that can be used as filter criteria during ranking and user preference enforcement.
Cook Flow & Feedback
- FR18: End users can open a recipe detail from the suggestions list and tap a clear Start Cooking CTA.
- FR19: The system provides a compact cook mode with step-by-step instructions optimized for quick-start recipes.
- FR20: End users can submit simple feedback tied to the session and source photo indicating whether the recommended recipe was usable (usable / not usable).
- FR21: The system logs the session timeline, the submitted photo reference, detected items with confidence, chosen recipe, and feedback for each user session.
User Preferences, Portioning & Filters
- FR22: End users can specify basic preference filters (e.g., family-friendly, vegetarian, allergies) that influence ranking and filtering.
- FR23: End users can choose desired maximum total time-to-table (e.g., 20 minutes, 30 minutes) which the matching engine uses when ranking.
- FR24: End users can select or adjust portion scaling on a recipe before starting cook mode.
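Portion scaling (FR24, using the scaling information from FR16) can be sketched as a simple linear rescale of ingredient quantities. Linear scaling is an assumption; real recipes may need per-ingredient rules (for example, seasoning or cook times), and the tuple shape used here is hypothetical.

```python
def scale_ingredients(ingredients, base_servings, target_servings):
    """Linearly scale (name, quantity, unit) tuples from base to target servings."""
    if base_servings <= 0 or target_servings <= 0:
        raise ValueError("servings must be positive")
    return [
        (name, quantity * target_servings / base_servings, unit)
        for name, quantity, unit in ingredients
    ]
```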
Analytics, Metrics & Instrumentation
- FR25: The system records and reports match-rate metrics (usable-recipe rate) per cohort to support the ≥70% success criterion.
- FR26: The system measures and records time-to-action (photo submitted → Start Cooking) per session and per cohort.
- FR27: The system tracks per-item detection precision and per-recipe conversion metrics to support ops triage and model improvement.
- FR28: The system provides event-level logging that ties image uploads, detections, recipe matches, user confirmations/edits, and feedback to a single session.
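As an illustration of how session-keyed events support FR26, the sketch below derives time-to-action per session from a flat event stream. The event names (`photo_submitted`, `start_cooking`) and the tuple layout are hypothetical, and measuring from the first submission in a session (rather than the last retry) is an assumption.

```python
def time_to_action_seconds(events):
    """events: iterable of (session_id, event_name, unix_timestamp_seconds).

    Returns {session_id: seconds from first photo submission to first Start Cooking}.
    Sessions that never reach Start Cooking are omitted.
    """
    submitted, started = {}, {}
    for session_id, name, ts in events:
        if name == "photo_submitted":
            submitted[session_id] = min(ts, submitted.get(session_id, ts))
        elif name == "start_cooking":
            started[session_id] = min(ts, started.get(session_id, ts))
    return {
        sid: started[sid] - submitted[sid]
        for sid in submitted
        if sid in started and started[sid] >= submitted[sid]
    }
```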
Admin, Ops & Support Tools
- FR29: Admin users can access aggregated analytics dashboards that surface low-confidence incident rates, match-rate trends, and cohort retention.
- FR30: Admin/ops users can sample consented images and view associated detection metadata and session logs for triage and labeling.
- FR31: Support users can open a case that includes the original photo, detected items with confidence, the matched recipe, and the session timeline.
- FR32: Admin users can edit recipe metadata (tags, ingredient mappings) with changes versioned and traceable to the editor and timestamp.
Integrations & Developer APIs
- FR33: The system exposes an authenticated API endpoint that accepts an image payload and returns ranked recipe matches with per-item and per-recipe confidence metadata.
- FR34: The system exposes API response fields for matched recipe IDs, match reasons, ingredient canonical IDs, and suggested substitutions.
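A possible shape for the response described in FR33 and FR34 is sketched below using dataclasses serialized to JSON. Every field name and ID format here is hypothetical; the document specifies which kinds of information the response carries, not the wire format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DetectedItem:
    taxonomy_id: str   # ingredient canonical ID
    confidence: float  # per-item confidence


@dataclass
class RecipeMatch:
    recipe_id: str
    confidence: float      # per-recipe confidence
    match_reasons: list    # e.g. ["matched_ingredients", "time_fit"]
    ingredient_ids: list   # canonical IDs the recipe requires
    substitutions: dict = field(default_factory=dict)  # ingredient ID -> alternatives


@dataclass
class MatchResponse:
    detected_items: list   # list of DetectedItem
    matches: list          # list of RecipeMatch, already ranked


def to_json(response: MatchResponse) -> str:
    # asdict recurses into nested dataclasses, lists, and dicts.
    return json.dumps(asdict(response), sort_keys=True)
```

Keeping the contract as plain data like this also supports the stated goal of platform-agnostic responses for the later Android port.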
Platform, Deployment & Privacy
- FR35: End users can access the product via a native mobile app (iOS first for MVP); web/desktop clients are not provided in the MVP.
- FR36: The system implements ephemeral image handling for inference: images used only for matching are deleted according to the product privacy policy unless the user explicitly opts into training/ops sampling.
- FR37: The mobile client requests and enforces required permissions (camera and photo library) and surfaces consent language during capture.
Quality & Safety Controls (functional)
- FR38: The system can mark and surface recipes that depend on low-confidence detections so the UX can require lightweight confirmation before presenting them as usable.
- FR39: The system provides substitution and quick alternate-recipe suggestions when a required detected ingredient is low-confidence or absent.
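The gating described in FR38 and FR39 could be sketched as below. The 0.6 threshold and the status strings are assumptions; the PRD does not specify a confidence cutoff.

```python
CONFIRM_THRESHOLD = 0.6  # assumed cutoff, not specified in the PRD


def recipe_status(required_ids, detections, threshold=CONFIRM_THRESHOLD):
    """Classify a recipe given detections as {ingredient_id: confidence}.

    Returns (status, ingredient IDs that need attention).
    """
    missing = [i for i in required_ids if i not in detections]
    if missing:
        # FR39: offer substitutions or alternate recipes for absent items.
        return "needs_substitution", missing
    low = [i for i in required_ids if detections[i] < threshold]
    if low:
        # FR38: ask for lightweight confirmation before marking usable.
        return "needs_confirmation", low
    return "usable", []


detections = {"ing_egg": 0.93, "ing_milk": 0.41}
```

For example, `recipe_status(["ing_egg", "ing_milk"], detections)` asks the user to confirm milk, while a recipe that also needs an undetected ingredient is routed to substitutions instead.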
Operational Capabilities
- FR40: The system supports a manual ops labeling workflow where sampled images and corrections can be added as labeled training data.
- FR41: The system supports exportable logs and incident bundles to enable reproduction of low-confidence failures by engineering and ops.
Non-Functional Requirements
Performance
Goal: Meet the product-level target of an end-to-end median time from photo submission to Start Cooking of ≤ 3 minutes while prioritizing model accuracy.
Overall target
- End-to-end median (photo submission → explicit user Start Cooking) ≤ 3 minutes.
- End-to-end p95 target ≤ 5 minutes to limit long-tail user pain.
Per-component measurable targets
- Client upload (client → server): median ≤ 10s, p95 ≤ 15s.
- Vision inference (image → detected items): median ≤ 20s, p95 ≤ 30s.
- Recipe matching and re-ranking (after confirmation): median ≤ 3s, p95 ≤ 5s.
Measurement and SLOs
- Instrument event timestamps at each component boundary (upload start, upload complete, inference complete, match returned, Start tapped) to compute median/p95 SLOs per cohort.
- Alert if any component misses its SLO for two consecutive deploys or if end-to-end median rises >20% vs baseline.
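The measurement bullets above can be sketched as follows: derive durations from boundary timestamps, compute median and p95, and flag a >20% rise in the end-to-end median. The event key names and the nearest-rank percentile method are assumptions. Matching latency is omitted because it is measured from user confirmation, which is not among the listed boundary events.

```python
import math
from statistics import median


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer in 1..100."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def component_durations(session):
    """session maps event name -> epoch seconds (key names are assumed)."""
    return {
        "upload": session["upload_complete"] - session["upload_start"],
        "inference": session["inference_complete"] - session["upload_complete"],
        "end_to_end": session["start_tapped"] - session["upload_start"],
    }


def median_regressed(current_median, baseline_median, max_rise=0.20):
    """True if the median rose more than max_rise relative to baseline."""
    return current_median > baseline_median * (1 + max_rise)


session = {
    "upload_start": 0,
    "upload_complete": 8,
    "inference_complete": 26,
    "start_tapped": 150,
}
durations = component_durations(session)
samples = list(range(1, 21))  # 20 made-up inference latencies, in seconds
```

In practice the same functions would run per cohort over all sessions in a reporting window.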
Reliability
Goal: Keep the core flow dependable for users so accuracy experiments do not fail due to instability.
- API availability (inference + matching endpoints): target SLA 99.5% monthly uptime.
- Request error rate (5xx responses) ≤ 1% overall; alert at >0.5% sustained for 10 minutes.
- Client-side timeouts and retry behavior: client upload timeout 60s, retries use exponential backoff with max 3 attempts, and queued uploads survive app backgrounding where feasible.
- Offline resilience: surface a clear offline state and provide a manual ingredient entry fallback so users can continue when the network is unavailable.
- Zero-downtime model updates: deploy inference model versions with traffic-shift staged rollouts and automated canary checks on match-rate and usable feedback.
- Monitoring and incident runbooks: automated dashboards for latency, error rate, match-rate, and usable/not-usable feedback; on-call alerting tied to runbooks for inference failures and sharp drops in detection confidence.
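The client retry behavior above (exponential backoff, max 3 attempts) could look like the sketch below. It is written in Python for illustration even though the MVP client is iOS. It assumes "max 3 attempts" means three total tries, a 1-second base delay, and a hypothetical `send` callable that raises `ConnectionError` or `TimeoutError` on failure.

```python
import time


def upload_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or max_attempts tries have failed.

    Delays double after each failure: base_delay, then 2 * base_delay.
    The final failure is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Usage: a fake upload that fails twice, then succeeds.
calls = []
delays = []


def flaky_send():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("network dropped")
    return "uploaded"


result = upload_with_retry(flaky_send, sleep=delays.append)
```

Injecting `sleep` keeps the backoff schedule testable without real waiting.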
Security & Privacy
Goal: Protect user images and personal data while enabling consented ops sampling for model improvement.
- Ephemeral image handling: images used only for real-time matching must be deleted immediately after inference completes unless the user explicitly opts into sampling for ops/training.
- Retention for sampled images: explicit opt-in required; retention TTL ≤ 90 days and deletion audited.
- Encryption: TLS 1.2+ for transport; encryption at rest using strong ciphers (e.g., AES-256) for any stored images or sensitive metadata.
- Access control and audit trails: least-privilege access for ops/admin tools, role-based access control, and immutable audit logs for image access and edits.
- Consent and data subject rights: obtain user consent at photo capture, surface privacy controls in settings, and support GDPR data subject requests within required windows via documented DPA processes.
- Minimal metadata logging: avoid storing unnecessary PII in logs; when logs include identifiers, ensure they are access-controlled and time-limited.
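The deletion rules above (immediate deletion without opt-in, a 90-day TTL with opt-in) can be expressed as a single predicate. This is a sketch only: the record fields `opted_in`, `inference_done`, and `stored_at` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

RETENTION_TTL = timedelta(days=90)  # upper bound stated in the requirements


def should_delete(image, now):
    """Decide whether a stored image must be deleted.

    image is a dict with hypothetical keys:
      opted_in (bool), inference_done (bool), stored_at (aware datetime).
    """
    if not image["opted_in"]:
        # Ephemeral handling: delete as soon as inference has completed.
        return image["inference_done"]
    # Sampled images: delete once the retention TTL has elapsed.
    return now - image["stored_at"] >= RETENTION_TTL


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
ephemeral = {"opted_in": False, "inference_done": True, "stored_at": now}
fresh_sample = {"opted_in": True, "inference_done": True,
                "stored_at": now - timedelta(days=10)}
expired_sample = {"opted_in": True, "inference_done": True,
                  "stored_at": now - timedelta(days=90)}
```

A scheduled job could apply this predicate and write each deletion to the audit log.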
Scalability
Goal: Support the business growth targets in the PRD while maintaining performance SLOs.
- Initial capacity target: support 5,000 weekly active users (WAU) by month 3 while meeting performance SLOs.
- Growth headroom: architecture must scale to at least 10x initial WAU (target 50,000 WAU) with <10% median-latency degradation.
- Autoscaling and burst handling: inference and matching services must autoscale horizontally and tolerate traffic bursts up to 3x baseline sustained for short windows without violating SLOs.
- Capacity testing: include load tests that validate median and p95 SLOs at baseline and at 3x and 10x baseline traffic levels before major releases.
- Deployment elasticity: support rapid rollback and staged rollouts to limit blast radius when model changes cause latency or accuracy regressions.
Integration (API) Requirements
Goal: Third-party and internal integrations must be predictable and performant for partners and admin tooling.
- API response contract: inference API returns detected items with per-item confidence, matched recipe IDs with per-recipe confidence, and match-reason metadata within the server processing budget (vision inference + matching ≤ 25s server-side budget under normal load).
- Versioning and backward compatibility: all public/internal endpoints must be versioned; breaking changes require a migration window and documentation.
- Performance expectations: server-side processing (inference + matching) should meet the per-component targets above (median inference ≤ 20s; matching ≤ 3s).
- Auth, rate limiting, and observability: authenticated requests (API keys/OAuth), per-key rate limiting with clear 429 behavior, and request-level logging to reproduce incidents (respecting privacy rules for images).
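Per-key rate limiting with 429 behavior could be implemented with a token bucket per API key, as sketched below. The token bucket is one common choice and is an assumption here, as are the default rate and capacity; the PRD does not prescribe an algorithm or limits.

```python
import time


class TokenBucket:
    """Refills at rate_per_sec tokens per second, up to capacity."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keeps one bucket per API key and maps decisions to HTTP statuses."""

    def __init__(self, rate_per_sec=1.0, capacity=5, clock=time.monotonic):
        self._args = (rate_per_sec, capacity, clock)
        self._buckets = {}

    def status_for(self, api_key):
        bucket = self._buckets.get(api_key)
        if bucket is None:
            bucket = self._buckets[api_key] = TokenBucket(*self._args)
        return 200 if bucket.allow() else 429


# Usage with a fake clock so the behavior is deterministic.
fake_now = [0.0]
limiter = RateLimiter(rate_per_sec=1.0, capacity=2, clock=lambda: fake_now[0])
burst = [limiter.status_for("key-a") for _ in range(3)]
```

A real 429 response would typically also carry a Retry-After header so clients know when to try again.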
Measurement & Acceptance
- Each NFR must map to a metric with instrumentation (event timestamps, SLO dashboards, error budgets) and acceptance criteria tied to the product success signals (match-rate and time-to-action).
- MVP acceptance: end-to-end median photo→start ≤ 3 minutes in production for the invited cohort or pilot; component SLOs may be relaxed during early experiments but must be tracked and improved post-pilot.
- Operational readiness checklist: monitoring dashboards, alerting rules, incident runbooks, and privacy-compliant ops sampling flows must be in place before broad rollout.
Build Starting Point
Start by building the camera capture flow and image upload pipeline: it tests your riskiest assumption and is the UX wedge that defines the product. Wire the iOS camera permission request, capture UI, client-side image compression, and a basic cloud endpoint that accepts the image and returns a dummy ingredient list. This unblocks two critical paths in parallel: your team can immediately test real pantry photos to validate vision feasibility (offline benchmarking), and designers/PMs can iterate on the Suggestions and Ingredient Confirmation UX with realistic data without waiting for the full matching engine. Once you confirm vision accuracy is plausible (even with a simple rule-based baseline) and users feel fast relief at seeing *any* ranked recipes, you'll have de-risked the core insight and can confidently build the recipe matching and ranking logic that follows.