Project: doctolib-doctolib · Branch: master · Mode: · Model: claude-sonnet-4-5-20250929
Stats: 1 user message, 30 tool calls, tools used: Read, Glob, Grep, Bash.

Conversation Turn 1 — User

Based on comprehensive codebase exploration, design a solution for the following issue:

Problem

With the upload_document_write feature switch enabled:
- First attempt to view documents ("dokumente anzeigen", i.e. "show documents") fails
- Second attempt works
- The document_url for decryptedFileUrl differs between attempts
- Cannot test locally, only on staging (the S3 bucket for v2 is not configured in dev)

Root Cause Analysis

1. Upload flow with feature switch:
   - Frontend uses UPLOAD_DOCUMENT_FEATURE_SWITCH_OVERRIDE (mode: 'double-write')
   - buildDocumentEndpoint() returns /doctor_app/v2/documents when the feature switch is enabled
   - Documents are uploaded via POST to the v2 endpoint
2. V2 endpoint behavior:
   - Uses Documents::V3::DocumentService.create_document
   - Uses an atomic 2-phase commit pattern with a Documents::DocumentUpload marker
   - Uploads to both the documents_server_encrypted (SSE) and the documents (legacy) S3 buckets
   - Uses Documents::V2::AtomicUploadService + Documents::V3::S3UploadService
3. Race condition:
   - The v2 POST response may return before the S3 upload completes or the database transaction fully commits
   - The frontend then fetches unclassified documents from /doctor_app/unclassified_documents
   - The document may not yet be fully available/visible in the query results
   - On the second attempt the async operations have completed, so it works
4. HAR file evidence:
   - First HAR: only v1 GET calls, failures
   - Second HAR: shows the POST to v2, then successful retrievals

Key Files Involved

- Frontend upload: /Users/USER/workspace/doctolib/doctolib/app/assets/javascripts/contexts/doctor/desktop/DocumentImportContext.tsx (line 175, uses documentsCreate)
- Document creation hook: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/assets/javascripts/queries/documents/doctor_app/documents.ts (lines 443-598, useDocumentsCreate)
- Endpoint builder: /Users/USER/workspace/doctolib/doctolib/packages/@doctolib/documents-ui/src/helpers/buildDocumentEndpoint.ts (lines 29-36)
- Feature switch config: /Users/USER/workspace/doctolib/doctolib/packages/@doctolib/documents-ui/src/constants.ts (lines 4-7)
- Unclassified docs fetch: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/assets/javascripts/queries/documents/doctor_app/unclassifiedDocuments.ts (lines 30-123)
- V2 controller: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/controllers/documents/doctor_app/v2/documents_controller.rb
- V3 service: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/services/documents/v3/document_service.rb
- V3 S3 upload: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/services/documents/v3/s3_upload_service.rb

Requirements for Solution

1. Must ensure the document is fully available before returning the response to the frontend
2. Should maintain double-write compatibility
3. Should not break existing v1 behavior
4. Must be testable on staging (not just locally)
5. Should handle both S3 upload completion and database transaction completion

Design Constraints

- Cannot test locally (S3 bucket not configured)
- Must only use Chrome DevTools for debugging on staging
- Solution must work with the existing atomic upload pattern

Please design a detailed solution considering:
- Where to make the changes (backend vs frontend vs both)
- How to ensure atomicity and consistency
- How to handle potential S3 upload delays
- Testing strategy for the staging environment
- Rollback plan if issues arise

Turn 1 — Assistant

I'll explore the codebase to understand the document upload and retrieval flow, then design a comprehensive solution.

Tools used:
- Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/controllers/documents/doctor_app/v2/documents_controller.rb

Tool results:
→ result (ok): 1→# frozen_string_literal: true 2→ 3→# V2 Documents REST Controller - Drop-in replacement for V1 4→# 5→# Architecture: 6→# - Thin wrapper around Documents::V3::DocumentService 7→# - Inherits from V1 controller to reuse all helper methods 8→# - Converts ActionDispatch::UploadedFile to hash for service 9→# - Returns V2 JSON response (without Tanker encryption details) 10→# 11→# Key differences from V1 (internal only): 12→# - Uses Documents::V3::D …(truncated, 15746 more chars)

Tools used:
- Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/services/documents/v3/document_service.rb

Tools used:
- Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/services/documents/v3/s3_upload_service.rb

Tool results:
→ result (ok): 1→# frozen_string_literal: true 2→ 3→module Documents 4→ module V3 5→ class S3UploadService 6→ # Uploads file to SSE bucket (documents_server_encrypted) 7→ # 8→ # @param key [String] S3 object key 9→ # @param file [Hash] File hash with :content (String or IO), :content_type, :size 10→ # Note: :content_type and :size are not currently used but passed for consistency 11→ # @return [void] 12→ # @raise
[Erro …(truncated, 4188 more chars) Tool results: → result (ok): 1→# frozen_string_literal: true 2→ 3→module Documents 4→ module V3 5→ # V3 DocumentService: Refactored document service addressing SOLID violations 6→ # 7→ # Key differences from V2: 8→ # - Pure orchestration (no embedded business logic) 9→ # - Tanker encryption settings centralized in TankerEncryptionContext 10→ # - Delegates encryption to MetadataEncryptionService and FileEncryptionService 11→ # - Delegates persistence to Do …(truncated, 36033 more chars) Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/assets/javascripts/queries/documents/doctor_app/documents.ts (443–603) Tool results: → result (ok): 443→export const useDocumentsCreate = () => { 444→ const buildEncryptedFormData = useEncryptedDocumentFormDataBuilder() 445→ const queryClient = useQueryClient() 446→ const invalidatePatientDocuments = useInvalidatePatientDocuments() 447→ const invalidateAppointmentDocuments = useInvalidateAppointmentDocuments() 448→ 449→ return useMutation({ 450→ mutationFn: async (variables) => { 451→ let body 452 …(truncated, 6323 more chars) Now let me read the unclassified documents query to understand how documents are fetched: Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/assets/javascripts/queries/documents/doctor_app/unclassifiedDocuments.ts (30–130) Tool results: → result (ok): 30→export const useUnclassifiedDocumentsIndex = ( 31→ { enabled, onSuccess, select }: UnclassifiedDocumentsIndexProps = { 32→ enabled: true, 33→ onSuccess: undefined, 34→ select: undefined, 35→ } 36→): UseQueryResult => { 37→ const httpClient = useContext(HttpClientContext) 38→ const toast = useToast() 39→ const { decryptRecord } = useContext(TankerContext) 40→ 41→ return useQuer …(truncated, 4229 more chars) Tools used: - Glob: /documents/v2/atomic_upload_service.rb Tool results: → result (ok): No files found Tools used: - Grep: DocumentUpload Tool results: → result 
(ok): No files found Tools used: - Grep: class DocumentUpload Tool results: → result (ok): No files found Tools used: - Bash: find /Users/USER/workspace/doctolib/doctolib/engines/documents -type f -name "unclassified_documentscontroller.rb" Tool results: → result (ok): /Users/USER/workspace/doctolib/doctolib/engines/documents/app/controllers/documents/doctor_app/unclassified_documents_controller.rb Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/controllers/documents/doctor_app/unclassified_documents_controller.rb Tool results: → result (ok): 1→# frozen_string_literal: true 2→ 3→class Documents::DoctorApp::UnclassifiedDocumentsController < Documents::DoctorApp::ApplicationController 4→ after_action :verify_policy_scoped, only: [:index] 5→ after_action :verify_authorized, except: [:index] 6→ 7→ PAGINATE_PER = 500 8→ 9→ def index 10→ page = permitted_params_for_index[:page] || 1 11→ 12→ base_columns = %i[id content_type created_at page_count size] 13→ 14→ document …(truncated, 3577 more chars) Tools used: - Grep: document_upload in /Users/USER/workspace/doctolib/doctolib/engines/documents Tool results: → result (ok): Found 30 files /Users/USER/workspace/doctolib/doctolib/engines/documents/app/models/documents/document_access.rb /Users/USER/workspace/doctolib/doctolib/engines/documents/test/integration/documents/doctor_app/mobile/upload/upload_test.rb /Users/USER/workspace/doctolib/doctolib/engines/documents/app/services/documents/v3/document_service.rb /Users/USER/workspace/doctolib/doctolib/engines/documents/test/integration/documents/doctor_app/mobile/upload/documents_upload_double_write_test.rb /Users/USE …(truncated, 3280 more chars) Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/models/documents/document_upload.rb Tool results: → result (ok): 1→# frozen_string_literal: true 2→ 3→class Documents::DocumentUpload < Documents::Db1Record 4→ # rubocop:disable Rails/TableNameAssignment 5→ 
self.table_name = 'document_uploads' 6→ # rubocop:enable Rails/TableNameAssignment 7→ 8→ data_taxonomy do 9→ description 'Saga state table for tracking document upload consistency between S3 and database' 10→ default secret: false, 11→ system_id: false, 12→ health_informati …(truncated, 1666 more chars) Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/services/documents/v3/document_persistence_service.rb Tool results: → result (ok): 1→# frozen_string_literal: true 2→ 3→module Documents 4→ module V3 5→ class DocumentPersistenceService 6→ # Persists DB1 records (document, accesses, server_encrypted_data) 7→ # NOTE: Caller is responsible for wrapping in transaction 8→ # DocumentFilename (DB4) must be persisted separately via persist_document_filename 9→ # 10→ # @param document [Documents::Document] Document to persist 11→ # @param accesses [Array …(truncated, 11537 more chars) Now let me explore the unclassified documents query policy to understand visibility: Tools used: - Grep: class.UnclassifiedDocumentPolicy Tool results: → result (ok): No files found Tools used: - Bash: find /Users/USER/workspace/doctolib/doctolib/engines/documents -type f -path "/policies/unclassified" Tool results: → result (ok): /Users/USER/workspace/doctolib/doctolib/engines/documents/app/policies/documents/doctor_app/documents/unclassified_document_policy.rb /Users/USER/workspace/doctolib/doctolib/engines/documents/test/policies/documents/doctor_app/documents/unclassified_document_policy_test.rb Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/policies/documents/doctor_app/documents/unclassified_document_policy.rb Tool results: → result (ok): 1→# frozen_string_literal: true 2→ 3→class Documents::DoctorApp::Documents::UnclassifiedDocumentPolicy < Documents::DoctorApp::ApplicationPolicy 4→ class Scope < ApplicationScope 5→ include DoctorAuthorizations 6→ 7→ def resolve 8→ scope 9→ .not_trashed 10→ .where(kind: nil) 11→ 
.uploaded_by(current_account) 12→ .left_outer_joins(:accesses) 13→ .where(accesses: { document_id: nil }) 14→ …(truncated, 374 more chars) Tools used: - Grep: def unclassified in /Users/USER/workspace/doctolib/doctolib/engines/documents Tool results: → result (ok): Found 1 file /Users/USER/workspace/doctolib/doctolib/engines/documents/app/models/documents/document.rb Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/models/documents/document.rb (1–151) Tool results: → result (ok): 1→# frozen_string_literal: true 2→ 3→class Documents::Document < Documents::Db1Record 4→ include ActiveRecordCouchbaseRelations 5→ include SyncHelper 6→ include DataRetention::Retainable 7→ 8→ data_taxonomy do 9→ description 'Not set by the table owner' 10→ default secret: false, 11→ system_id: false, 12→ health_information_likelihood: :none, 13→ data_subject_identity_likelihood: :none, 14→ …(truncated, 6332 more chars) Tools used: - Grep: def unclassified\? in /Users/USER/workspace/doctolib/doctolib/engines/documents/app/models/documents/document.rb Tool results: → result (ok): 663: def unclassified? 
Tools used:
- Bash: cd /Users/USER/workspace/doctolib/doctolib && git log --all --oneline --grep="upload_document_write" -20

Tool results:
→ result (ok): 7547d84e7dda test(TIGAH-2447): Add E2E test for featureSwitchOverride during document classification f9f96c0e1e27 Merge branch 'master' of https://github.com/doctolib/doctolib into TIGAH-2492_mobile_upload_document_write_feature_switch 08dd47da363d fix(TIGAH-2492): Pass featureSwitchOverride to document PUT endpoint 1d6371d799ba Merge remote-tracking branch 'origin/master' into TIGAH-2492_mobile_upload_document_write_feature_switch 8207d79867c6 Merge remote-tracking branch 'origin/master' into T …(truncated, 320 more chars)

Tools used:
- Read: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/models/documents/document.rb (660–670)

Tool results:
→ result (ok): 660→ ::Doctolib::O11y.tracer.add_attributes(document_trashed: CurrentMetrics.deleted_documents_count) 661→ end 662→ 663→ def unclassified? 664→ kind.nil? && accesses.empty? 665→ end 666→ 667→ def imported? 668→ ehr_import_id.present? 669→ end

Tools used:
- Bash: cd /Users/USER/workspace/doctolib/doctolib && git show b4476f586848 --stat

Tool results:
→ result (ok): commit b4476f58684842ddbe8e95084f884e7bdb0e08e0 Author: Clerton Araujo <> Date: Mon Jan 19 14:25:07 2026 +0100 feat(TIGAH-2492): support featureSwitchOverride in document upload API endpoint selection This fix enables the upload_document_write feature switch to correctly route document uploads to the V2 API endpoint when using featureSwitchOverride.
Changes: - Updated buildDocumentEndpoint() to accept featureSwitchOverride parameter - Added featureS …(truncated, 1287 more chars) Now let me understand the issue better by examining database transaction flow and checking for potential race conditions: Tools used: - Grep: uploaded_by in /Users/USER/workspace/doctolib/doctolib/engines/documents/app/models/documents/document.rb Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/packages/@doctolib/documents-ui/src/helpers/buildDocumentEndpoint.ts Tool results: → result (ok): 1→import { isFeatureEnabled } from '@doctolib/variation_management' 2→import type { FeatureSwitchOverride } from './types' 3→ 4→// Helper function to determine if V2 API should be used 5→export function shouldUseV2Api(featureSwitchOrOverride?: string | FeatureSwitchOverride): boolean { 6→ // Handle FeatureSwitchOverride object 7→ if (typeof featureSwitchOrOverride === 'object' && featureSwitchOrOverride !== null) { 8→ const override = featureSwitchOrOverri …(truncated, 1654 more chars) Tool results: → result (ok): 219: scope :uploaded_by, ->(account) { where(account: account) } Now I have a comprehensive understanding of the issue. 
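The `unclassified?` predicate just read from document.rb, together with the policy scope's kind/accesses conditions, reduces to a single rule. A minimal plain-Ruby sketch of that rule (a Struct stands in for the ActiveRecord model; only the `unclassified?` body mirrors the source, the rest is illustrative):

```ruby
# A document counts as "unclassified" when it has no kind assigned
# and no access records. The Struct is a stand-in for the real model.
Document = Struct.new(:kind, :accesses) do
  def unclassified?
    kind.nil? && accesses.empty?
  end
end

fresh_upload = Document.new(nil, [])           # just uploaded, not yet classified
classified   = Document.new('lab_report', [])  # a kind was assigned
shared       = Document.new(nil, [:access])    # shared, so it has an access record

puts fresh_upload.unclassified? # => true
puts classified.unclassified?   # => false
puts shared.unclassified?       # => false
```

This is why a freshly uploaded document should appear in the unclassified list immediately, provided the row is already visible to the querying connection.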
Let me check the database transaction isolation level and read committed behavior: Tools used: - Grep: Db1Record in /Users/USER/workspace/doctolib/doctolib/engines/documents Tool results: → result (ok): Found 5 files limit: 5, offset: 0 /Users/USER/workspace/doctolib/doctolib/engines/documents/test/services/documents/v3/ehr_import_v2/two_phase_bulk_processor_test.rb /Users/USER/workspace/doctolib/doctolib/engines/documents/app/services/documents/v3/ehr_import_v2/two_phase_bulk_processor.rb /Users/USER/workspace/doctolib/doctolib/engines/documents/app/models/documents/ehr_import_pending_document.rb /Users/USER/workspace/doctolib/doctolib/engines/documents/app/models/documents/document_access.rb …(truncated, 103 more chars) Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/app/assets/javascripts/contexts/doctor/desktop/DocumentImportContext.tsx (170–190) Tool results: → result (ok): 170→ const isUploadQueueBusy = uploadQueue.length > 0 171→ const [processing, setProcessing] = useState(false) 172→ const uploading = isUploadQueueBusy || processing 173→ const { initializeOCR, ocrProcessing, getOcrFileDetails, persistOCRResult } = useOCR() 174→ const invalidateUnclassifiedDocuments = useInvalidateUnclassifiedDocumentsIndex() 175→ const { mutateAsync: documentsCreate } = useDocumentsCreate() 176→ const { processLabReport } = useProcessLabReport() 177→ …(truncated, 1020 more chars) Now let me trace how the invalidation flow works to understand the timing: Tools used: - Bash: cd /Users/USER/workspace/doctolib/doctolib && grep -n "invalidateUnclassifiedDocuments" app/assets/javascripts/contexts/doctor/desktop/DocumentImportContext.tsx Tool results: → result (ok): 174: const invalidateUnclassifiedDocuments = useInvalidateUnclassifiedDocumentsIndex() 396: void invalidateUnclassifiedDocuments() 409: invalidateUnclassifiedDocuments, Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/app/assets/javascripts/contexts/doctor/desktop/DocumentImportContext.tsx 
(200–300) Tool results: → result (ok): 200→ ) => { 201→ // eslint-disable-next-line promise/catch-or-return 202→ persistOCRResult({ 203→ ocr_result: 'ocr_result' in document ? document.ocr_result : undefined, 204→ documentId: document.id, 205→ shareWith: [{ self: true }], 206→ featureName: 'persistOCRResultWithTracking', 207→ featureSwitchOverride: UPLOAD_DOCUMENT_FEATURE_SWITCH_OVERRIDE, 208→ }).then(({ tanker_encrypted_ocr_result }) => { 209→ set …(truncated, 4744 more chars) Tools used: - Read: /Users/USER/workspace/doctolib/doctolib/app/assets/javascripts/contexts/doctor/desktop/DocumentImportContext.tsx (300–420) Tool results: → result (ok): 300→ ) => { 301→ queryClient.cancelQueries({ queryKey: unclassifiedDocumentsIndexQueryKey }) 302→ const { file, document, parentDocumentIds } = acceptedFile 303→ const { localFileId, filename, page_count } = document 304→ const timestampBeforeCall = new Date().getTime() 305→ 306→ const shareWith = [{ self: true }] 307→ 308→ const builder = buildEncryptedFormData({ 309→ file, 310→ baseFormBody: { 311→ filenam …(truncated, 4947 more chars) Perfect! Now I have a complete understanding of the issue. Let me create a comprehensive solution plan: Based on my codebase exploration, I now understand the complete picture of the race condition. The issue is clear: 1. Upload flow: POST to v2 endpoint → S3 upload (async) → DB transaction commit 2. Query invalidation: Happens immediately after POST returns (line 396) 3. Unclassified query: Fetches from /doctor_app/unclassified_documents which runs the policy scope that requires accesses: { document_id: nil } 4. 
Race condition: the DB transaction might not be committed yet, OR there's a read-after-write visibility issue in PostgreSQL.

The problem is that invalidateUnclassifiedDocuments() is called immediately after documentsCreate returns (line 396), but the database transaction in Documents::V3::DocumentService.persist_and_commit might not have been fully committed and visible to subsequent reads yet.

Now let me write a comprehensive implementation plan:

Tools used:
- Bash: cat > /Users/USER/.claude/plans/composed-crunching-lynx-agent-aaf1eb3.md << 'PLAN_EOF' …

Tool results:
→ result (ERROR): Permission for this tool use was denied. The tool use was rejected (eg. if it was a file edit, the new_string was NOT written to the file). Try a different approach or report the limitation to complete your task.

I apologize for the confusion. Since this is a read-only planning task, I cannot write to the plan file. Let me provide my comprehensive solution design directly:

Solution Design: Fix Race Condition in V2 Document Upload Visibility

Executive Summary

The issue is a database transaction visibility race condition. When documents are uploaded via the V2 endpoint with the upload_document_write feature switch enabled, the HTTP response is returned before the database transaction is fully committed and visible to subsequent queries. This causes the first fetch of unclassified documents to fail, while the second attempt succeeds.

Root Cause Analysis

Current flow:
1. Frontend: POST /doctor_app/v2/documents
2. Backend: Documents::V3::DocumentService.create_document executes the 2-phase commit:
   - Phase 1: create the DocumentUpload marker
   - Phase 2: upload to S3 (SSE + legacy buckets), synchronous but network I/O
   - Phase 3a: DB1 transaction, persisting document + accesses + metadata
   - Phase 3b: DB4, persisting DocumentFilename
3. The controller returns the response IMMEDIATELY after Phase 3b
4. Frontend: calls invalidateUnclassifiedDocuments() (line 396 in DocumentImportContext.tsx)
5. Frontend: React Query refetches /doctor_app/unclassified_documents
6. The backend query uses the policy scope: kind: nil AND accesses: { document_id: nil }

The Race Condition

Problem: PostgreSQL's READ COMMITTED isolation level doesn't guarantee read-after-write visibility across different database connections.

Why it happens:
- The document creation transaction might be in a "commit pending" state
- The subsequent GET request uses a different connection from the pool
- Network latency between Rails and the database delays visibility
- There is no explicit transaction flush before the response

Why the second attempt works: sufficient time has elapsed for transaction commit + visibility propagation.

Recommended Solution: Backend Transaction Flush + Visibility Verification

Primary Implementation: Explicit Visibility Check in V2 Controller

Location: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/controllers/documents/doctor_app/v2/documents_controller.rb

Implementation: add this method to the controller:

```ruby
# Ensures document is committed and readable before returning response.
# Prevents race conditions where subsequent queries don't see the document.
def ensure_document_visibility(document)
  # Force Rails to flush any pending database writes
  ActiveRecord::Base.connection.verify!

  # Verify document is visible in a fresh query (read-your-writes check)
  max_retries = 3
  retry_delay_ms = 50

  max_retries.times do |attempt|
    # Fresh query to verify visibility across connection pool.
    # Use unscoped to bypass default scopes.
    visible = Documents::Document.unscoped.exists?(id: document.id)

    if visible
      # Track successful visibility on first try vs retry
      MetricsService.track_visibility_check(
        source: self.class.name,
        attempt: attempt + 1,
        success: true
      )
      Doctolib::O11y.logger.debug(
        'Document visibility confirmed',
        document_id: document.id,
        attempt: attempt + 1
      ) if attempt > 0 # Only log if a retry was needed
      return true
    end

    # Document not visible yet, short retry
    sleep(retry_delay_ms / 1000.0) unless attempt == max_retries - 1
  end

  # If we get here, the visibility check failed - log an error but don't fail
  # the request. The document WAS created, just the visibility timing is unusual.
  MetricsService.track_visibility_check(
    source: self.class.name,
    attempt: max_retries,
    success: false
  )
  Doctolib::O11y.logger.error(
    'Document created but visibility check failed',
    document_id: document.id,
    max_retries: max_retries,
    document_key: document.key
  )
  # Don't fail the request - document exists, just a timing issue
  true
end
```

Modify the create method:

```ruby
def create
  file = extract_file_from_params
  tanker_encrypted_file = extract_tanker_encrypted_file

  return handle_validation_error(:render_missing_file_error) if file.blank?
  return handle_validation_error(:render_corrupted_file_error) unless validate_file(file)
  return handle_validation_error(:render_corrupted_file_error) unless validate_file(tanker_encrypted_file)
  return handle_validation_error(:render_title_validation_error) if title_validation_failed?

  @document = create_document_via_service(file, tanker_encrypted_file)

  # NEW: Ensure transaction is committed and document is visible
  ensure_document_visibility(@document)

  handle_post_creation_response
  render formats: :json
rescue ArgumentError => error
  handle_create_error(error, :bad_request)
rescue
  # ... rest of existing error handling ...
end
```

Why this works:
- Forces verification that the document is visible across the connection pool
- Retry logic handles typical commit propagation delays (50-150ms)
- Non-blocking: max 150ms additional latency (3 × 50ms)
- Doesn't fail requests even if the check times out (the document exists)
- Provides metrics for monitoring

Alternative: PostgreSQL Advisory Lock (More Robust)

If the connection flush isn't sufficient, use database-level synchronization:

```ruby
def ensure_document_visibility(document)
  # Use a PostgreSQL advisory lock for cross-session synchronization
  lock_key = Zlib.crc32("document_#{document.id}") # Convert to integer

  Documents::Document.connection.execute(
    "SELECT pg_advisory_lock(#{lock_key})"
  )
  begin
    # Verify document exists
    visible = Documents::Document.unscoped.exists?(id: document.id)
    unless visible
      Doctolib::O11y.logger.error(
        'Document not visible after advisory lock',
        document_id: document.id
      )
    end
    visible
  ensure
    # Always release the lock
    Documents::Document.connection.execute(
      "SELECT pg_advisory_unlock(#{lock_key})"
    )
  end
end
```

Trade-offs:
- ✅ Stronger synchronization guarantee
- ✅ Database-native solution
- ⚠️ Slightly higher overhead (advisory lock acquisition)
- ⚠️ Potential for lock contention under high load

Secondary: Add MetricsService Tracking

Location: /Users/USER/workspace/doctolib/doctolib/engines/documents/app/services/documents/v3/metrics_service.rb

Add a method to track visibility checks:

```ruby
def self.track_visibility_check(source:, attempt:, success:)
  tags = { source: source, attempt: attempt, success: success }
  Doctolib::O11y.statsd.increment('documents.v2.visibility_check', tags: tags)

  # Track timing distribution
  if attempt > 1
    Doctolib::O11y.statsd.increment('documents.v2.visibility_retry', tags: tags)
  end
end
```

Tertiary: Frontend Defensive Enhancement (Optional)

Location: /Users/USER/workspace/doctolib/doctolib/app/assets/javascripts/contexts/doctor/desktop/DocumentImportContext.tsx

Around line 396, add a small delay as defense-in-depth:

```typescript
const { document: uploadedDocument } = responseBody as { document: DocumentCreateResponse }
// ... existing tracking code ...

// Defensive: small delay to ensure cross-geography visibility.
// The backend should handle this, but this adds resilience for edge cases.
await new Promise(resolve => setTimeout(resolve, 100)) // 100ms buffer

void invalidateUnclassifiedDocuments()
```

Rationale:
- Belt-and-suspenders approach
- Handles edge cases in the distributed staging environment
- The user doesn't perceive a 100ms delay (it is part of the upload experience)
- Non-breaking if the backend fix has issues

Testing Strategy for Staging

Test Case 1: Basic Upload Visibility (Primary Test)

Setup:
- Enable the upload_document_write feature switch
- Clear the browser cache
- Open the Chrome DevTools Network tab

Steps:
1. Navigate to "dokumente anzeigen"
2. Upload a single PDF document
3. Observe network requests in DevTools
4. Verify the document appears in the unclassified list IMMEDIATELY …(truncated, 5203 more chars)
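The bounded-retry "read-your-writes" check proposed above can be exercised outside Rails. A minimal sketch under stated assumptions: the ActiveRecord `exists?` probe is replaced by an injected block, and `visible_after_retries?` is a hypothetical name for illustration, not the production API:

```ruby
# Sketch of the bounded-retry visibility check: poll a probe up to
# max_retries times, sleeping briefly between attempts, and report
# whether the record ever became visible. Names are illustrative.
def visible_after_retries?(max_retries: 3, delay_s: 0.05, &probe)
  max_retries.times do |attempt|
    return true if probe.call
    # Back off briefly before re-checking, except after the final attempt
    sleep(delay_s) unless attempt == max_retries - 1
  end
  false
end

# Simulate a row that only becomes visible on the second read
checks = 0
result = visible_after_retries?(delay_s: 0.001) do
  checks += 1
  checks >= 2 # invisible on the first probe, visible afterwards
end

puts result # => true
puts checks # => 2
```

As in the controller version, the worst case adds only (max_retries - 1) × delay_s of latency, and a caller that treats a false return as non-fatal preserves the design's "don't fail the request" behavior.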