Tech Stack

  • Framework: Frappe v15 (Python-based ERP framework)
  • Backend: ERPNext integration for Sales Invoices and Payments
  • Database: MariaDB with Frappe ORM
  • Cloud Storage: Google Drive API for document management
  • PDF Processing: pypdf for AcroForm filling, Jinja2 for invoice templates
  • Email: Frappe's built-in email system with HTML templates

The Problem

The client needed a complete management system for their language services business. They provide four distinct service types—Over Phone Interpreting (OPI), Video Remote Interpreting (VRI), Onsite Interpreting (OSI), and Document Translation (DT)—each with different billing structures, workflows, and document requirements. The existing process involved manual tracking, scattered spreadsheets, and no centralized way to handle the complete lifecycle from session scheduling through invoice generation and payment.

Core Architecture

The system is built as a Frappe app with 25+ custom DocTypes. Each service type shares a common workflow state machine:

Incomplete → Scheduled → Active Session → Awaiting Approval → Processed → Invoiced → Paid
                   ↓           ↓
               Delayed    Canceled (Fee/Waived)

Every service document tracks its lifecycle through a workflow_progress child table, creating an immutable audit trail of status changes with timestamps and user attribution.

Access Code System

Instead of complex authentication for clients, the system uses access codes—short alphanumeric identifiers that map to Enterprise Clients. When an interpreter enters an access code, the system fetches all client details including billing rates:

@frappe.whitelist()
def get_access_code_and_client_details(self, access_code):
    """Lookup client info from access code"""
    client = frappe.db.sql("""
        SELECT parent, invoice_description
        FROM `tabEnterprise Clients access codes child`
        WHERE access_code = %s
    """, access_code, as_dict=1)

    if client:
        parent_client = frappe.get_doc("Enterprise Clients", client[0].parent)
        return {
            "parent_client": parent_client.name,
            "client_name": parent_client.agency_name,
            "project_title": client[0].invoice_description
        }

Each client can have multiple access codes for different departments or projects, and rates are automatically pulled based on language proficiency levels (L1-L4).

Google Drive Integration for Document Translation

Document translation projects need organized file storage with different permission levels. When a DT project is created, the system automatically generates a folder structure in Google Drive:

def generate_drive_folders(doc):
    """Create folder hierarchy for DT project"""
    parent_folder_id = frappe.db.get_single_value(
        "Google Integration Settings", "dt_folder_id"
    )
    google_drive, account = get_google_drive_object()

    # Create main project folder
    file_metadata = {
        "name": doc.project_title,
        "mimeType": "application/vnd.google-apps.folder",
        "parents": [parent_folder_id],
    }
    folder = google_drive.files().create(
        body=file_metadata, fields="id"
    ).execute()

    # Create subfolders with specific permissions
    subfolder_config = [
        ("Main", "main_project", "domain"),      # company domain readers only
        ("Vendor", "vendor", "anyone"),          # translator access
        ("Final", "final", "anyone"),            # client delivery
    ]

    for name, field, permission_type in subfolder_config:
        subfolder = google_drive.files().create(...).execute()

        if permission_type == "domain":
            google_drive.permissions().create(
                fileId=subfolder_id,
                body={"type": "domain", "role": "reader", "domain": "company.co"}
            ).execute()
        else:
            google_drive.permissions().create(
                fileId=subfolder_id,
                body={"type": "anyone", "role": "reader"}
            ).execute()

The "Main" folder is restricted to the company domain, while "Vendor" and "Final" folders allow anyone with the link to view—perfect for sharing with external translators and clients without managing individual permissions.

PDF Estimate Generation with AcroForm

Document translation requires formal estimates before work begins. Rather than generating PDFs from scratch, the system fills a pre-designed PDF template with form fields:

from pypdf import PdfReader, PdfWriter
from pypdf.generic import BooleanObject, NameObject

def generate_estimate_pdf(doc):
    """Fill AcroForm fields in the estimate template"""
    template_path = frappe.get_app_path("app_name", "templates", "estimate.pdf")
    reader = PdfReader(template_path, strict=False)
    writer = PdfWriter()

    for page in reader.pages:
        writer.add_page(page)

    # Clone and update AcroForm to force field re-rendering
    if "/AcroForm" in reader.trailer["/Root"]:
        acroform = reader.trailer["/Root"]["/AcroForm"].clone(writer)
        acroform.update({
            NameObject("/NeedAppearances"): BooleanObject(True)
        })
        writer._root_object.update({NameObject("/AcroForm"): acroform})

    # Map document fields to PDF form fields
    field_map = {
        "access_code": doc.estimate_access_code,
        "name": doc.estimate_name,
        "division": doc.estimate_agency_name,
        "words": doc.estimate_words,
        "pages": doc.estimate_pages,
        "estimate": doc.estimate_pdf_estimate,
        "normal_check": "Yes" if doc.estimate_normal_delivery else "Off",
        "rush_check": "Yes" if doc.estimate_rush_delivery else "Off",
    }

    writer.update_page_form_field_values(writer.pages[0], field_map)

    tmp = tempfile.NamedTemporaryFile(suffix=".pdf", delete=False)
    writer.write(tmp)
    return tmp.name

The NeedAppearances flag is crucial—without it, PDF viewers won't re-render the form fields with the new values. The generated estimate is automatically uploaded to the project's Google Drive folder.

Billing System Architecture

The billing system aggregates completed services across all four types, grouped by client. A single billing document can process hundreds of service records:

@frappe.whitelist()
def fetch_documents_for_billing(doc_name):
    """Aggregate processed services by client and type"""
    billing_doc = frappe.get_doc("Billing", doc_name)

    doc_types = [
        {"doctype": "Over Phone Interpreting", "abbreviation": "OPI"},
        {"doctype": "Video Remote Interpreting", "abbreviation": "VRI"},
        {"doctype": "Onsite Interpreting", "abbreviation": "OI"},
        {"doctype": "Document Translation", "abbreviation": "DT"}
    ]

    grouped_data = {}

    for dt in doc_types:
        filters = [
            ["status", "in", ["Processed", "Canceled - Fee"]],
            ["creation", ">=", start_date],
            ["creation", "<", end_date_inclusive],
        ]

        documents = frappe.get_all(dt["doctype"], filters=filters,
                                    fields=["name", "client_id", "billable_amount"])

        for doc in documents:
            # Group by client and service type
            grouped_data.setdefault(doc.client_id, {
                "document_types": {}
            })

            type_data = grouped_data[doc.client_id]["document_types"].setdefault(
                dt["doctype"], {"count": 0, "total_amount": 0}
            )

            type_data["count"] += 1
            type_data["total_amount"] += doc.billable_amount

    # Populate billing summary child table
    for client_id, data in grouped_data.items():
        for project_type, pd in data["document_types"].items():
            billing_doc.append("bills", {
                "client": client_name,
                "project_type": project_type,
                "occurrence": pd["count"],
                "billing_total": pd["total_amount"]
            })

Invoice Generation Pipeline

Creating invoices is a multi-phase process with real-time progress updates. Each client gets their own Sales Invoice with line items for each service type:

def create_invoices_from_billing(doc_name):
    """Generate invoices with PDFs and service breakdowns"""

    for client_index, (client_name, bills) in enumerate(client_entries.items()):
        # Phase 1: Create Sales Invoice
        frappe.publish_realtime("billing_invoice_progress", {
            "current_client": client_name,
            "phase": "Creating invoice",
            "percent_complete": percent_complete
        })

        invoice_doc = create_combined_sales_invoice(customer_name, bills)

        # Phase 2: Generate invoice PDF
        invoice_pdf_path = generate_invoice_pdf(invoice_doc, client_info)

        # Phase 3: Generate service breakdown PDFs
        breakdown_pdfs = []
        for service_type in ["opi", "vri", "oi", "dt"]:
            pdf_path = generate_service_breakdown_pdf(billing_doc, client_info, service_type)
            breakdown_pdfs.append(pdf_path)

        # Phase 4: Compress into ZIP archive
        zip_result = compress_pdfs_to_archive(invoice_doc, all_pdfs, client_info)

        # Phase 5: Upload to Google Drive
        upload_result = upload_to_google_drive(zip_path, client_info, billing_doc)

        # Phase 6: Update service documents to "Invoiced"
        update_service_documents_status(billing_doc, client_id, service_types)

The publish_realtime calls push progress updates via WebSocket, so the UI shows a live progress bar as each client is processed.

Google Drive Invoice Archives

Invoice packages (main invoice + service breakdowns) are compressed and uploaded to Google Drive with an auto-created folder structure:

def create_folder_structure(google_drive, client_info, billing_doc):
    """Create: Invoice Month / Client Name / files"""
    parent_folder_id = frappe.db.get_single_value(
        "Google Integration Settings", "invoice_folder_id"
    )

    # First level: Invoice period
    invoice_description = billing_doc.invoice_description  # e.g., "Invoice June 2025"
    desc_id = get_or_create_folder(google_drive, invoice_description, parent_folder_id)

    # Second level: Client name
    client_id = get_or_create_folder(google_drive, client_info["agency_name"], desc_id)

    # Store folder URL on billing document
    folder_url = f"https://drive.google.com/drive/folders/{desc_id}"
    billing_doc.db_set('google_drive_url', folder_url)

    return client_id

def get_or_create_folder(google_drive, folder_name, parent_id):
    """Idempotent folder creation"""
    # Check if folder exists first
    query = f"mimeType='application/vnd.google-apps.folder' AND name='{folder_name}' AND '{parent_id}' in parents AND trashed=false"
    results = google_drive.files().list(q=query).execute()

    if results.get('files'):
        return results['files'][0]['id']

    # Create new folder
    file_metadata = {
        'name': folder_name,
        'mimeType': 'application/vnd.google-apps.folder',
        'parents': [parent_id]
    }
    folder = google_drive.files().create(body=file_metadata).execute()
    return folder.get('id')

The idempotent get_or_create_folder function prevents duplicate folders if the billing process is run multiple times.

Mileage Calculation for Onsite Interpreting

Onsite interpreting includes travel expenses. The system tracks round-trip miles and applies configurable mileage rates:

def calculate_billable_amount(self):
    """Calculate total including hourly rate + mileage"""
    # Base: hourly rate × duration (minus lunch)
    duration_hours = (end_time - start_time).total_seconds() / 3600

    if self.start_lunch and self.end_lunch:
        lunch_duration = (end_lunch - start_lunch).total_seconds() / 3600
        duration_hours -= lunch_duration

    billable_amount = duration_hours * self.project_rate

    # Add mileage fee
    if self.miles_round_trip and self.mileage_rate:
        mileage_rate_amount = frappe.get_value(
            "Mileage Rates",
            {"contract_name": self.mileage_rate},
            "rate"
        )
        mileage_fee = float(self.miles_round_trip) * float(mileage_rate_amount)
        billable_amount += mileage_fee

    self.billable_amount = billable_amount

Mileage rates are stored in a lookup table, so when the IRS rate changes (Federal 2025 is $0.70/mile), it's a single update that applies to all future calculations.

Automated Reminder System

The system runs scheduled tasks to send escalating reminders based on session state and age:

scheduler_events = {
    "hourly": [
        "check_opi_interpreting_status_and_send_reminders"
    ],
    "cron": {
        "*/1 * * * *": [  # Every minute
            "update_session_status"
        ]
    }
}

def check_opi_interpreting_status_and_send_reminders():
    """Time-window based reminder escalation"""

    # Active sessions: escalating reminders
    windows = [
        (2, "hours", notify_session_over),       # 2h after start
        (12, "hours", notify_submit_times),      # 12h: submit your times
        (2, "days", notify_late_submission),     # 2d: late warning
        (6, "days", notify_manager_warning),     # 6d: manager escalation
        (7, "days", notify_final_warning),       # 7d: final warning
    ]

    for duration, unit, notification_func in windows:
        threshold = now - timedelta(**{unit: duration})
        sessions = frappe.get_all("Over Phone Interpreting",
            filters=[
                ["status", "=", "Active Session"],
                ["session_time", "<=", threshold],
                ["last_reminder", "!=", duration]  # Don't re-send
            ])

        for session in sessions:
            notification_func(session)

Each service type has its own reminder cadence tuned to typical completion times—phone calls resolve faster than onsite sessions.

Client Rate Structure

Enterprise clients have granular rate configurations across all service types and language proficiency levels:

ServiceL1L2L3L4Unit
Over Phone (OTP)$X.XX$X.XX$X.XX$X.XXper minute
Video Remote (VRI)$X.XX$X.XX$X.XX$X.XXper hour
Onsite (OS)$X.XX$X.XX$X.XX$X.XXper hour

Plus after-hours rates, rush fees, cancellation fees, and mileage rates—all configurable per client. When a session is created with a language, the system automatically looks up the language's proficiency level and fetches the corresponding rate from the client profile.

Invoice Reversals

Sometimes billing needs to be undone. The reversal process handles the complete cleanup:

@frappe.whitelist()
def reverse_invoices_for_billing(doc_name):
    """Complete reversal: invoices, statuses, local files, Drive folder"""

    # 1. Cancel and delete Sales Invoices
    for invoice in invoices:
        invoice_doc.cancel()
        frappe.delete_doc("Sales Invoice", invoice_name, force=True)
        delete_attached_files("Sales Invoice", invoice_name)

    # 2. Revert service documents to "Processed"
    for doc in service_documents:
        service_doc.status = "Processed"
        service_doc.billing_reference = ""
        service_doc.save()

    # 3. Delete Google Drive folder
    delete_billing_folder_from_drive(billing_doc)

    # 4. Clean up local files
    cleanup_local_files_for_billing(doc_name)

    # 5. Reset billing summary flags
    frappe.db.sql("""
        UPDATE `tabBilling Summary`
        SET invoiced = 0, paid = 0
        WHERE parent = %s
    """, doc_name)

The reversal is atomic—if billing needs to be regenerated, everything from the previous attempt is cleaned up first.

Custom Invoice Formatting

Some clients require special "Bill To" formatting on invoices. The Enterprise Clients doctype supports 8 custom lines that override the standard address format:

# Standard billing address
if not client.custom_invoice_format_check:
    bill_to = f"{client.billing_contact_name}\n{client.billing_address}\n{client.city}, {client.state} {client.zip}"

# Custom format (8 configurable lines)
else:
    bill_to = "\n".join([
        client.line_1,  # e.g., "ATTN: Accounts Payable"
        client.line_2,  # e.g., "RE: Contract #12345"
        client.line_3,  # ...
        # ... up to line_8
    ])

This handles government contracts, large enterprises with specific invoice routing requirements, and clients using third-party billing portals.

Session ID Generation

Each session gets a human-readable ID combining date, access code, and record suffix:

def generate_session_id(self):
    """Format: DDMMYY-ACCESSCODE-XX"""
    session_date = getdate(self.session_time)
    day = str(session_date.day).zfill(2)
    month = str(session_date.month).zfill(2)
    year = str(session_date.year)[-2:]

    self.session_id = f"{day}{month}{year}-{self.access_code}-{self.name[-2:]}"
    # Result: "251125-CLIENTCODE-45"

This creates IDs that interpreters can easily communicate over the phone while containing enough information to look up the record quickly.

Results

The system manages the complete language services lifecycle:

  • 25+ custom DocTypes covering all service types and supporting data
  • Automated Google Drive folder creation with role-based permissions
  • PDF estimate generation using fillable forms
  • Multi-phase billing with real-time progress updates
  • Invoice packages (main + breakdowns) compressed and archived to Drive
  • Escalating reminder system with 13 email templates
  • Complete reversal capability for billing corrections
  • Tiered rate structures by language proficiency and service type
  • Mileage tracking and calculation for onsite services

What started as scattered spreadsheets is now a unified system where sessions flow from scheduling through payment with full audit trails, automated notifications, and organized document storage.