
Transform complex data into actionable insights
A production-grade, multi-tenant financial analytics platform built on Databricks Medallion Architecture — transforming raw GL, AR, AP, and budget data into a live 5-dashboard Power BI suite with full Row-Level Security for enterprise clients.
The Problem
The client's financial data lived across 6 isolated CSV exports — general ledger, AR, AP, chart of accounts, budget, and trial balance — with no unified model, no KPIs, and no way to report across departments or fiscal periods reliably.
General Ledger, Accounts Receivable, Accounts Payable, Chart of Accounts, Budget vs Actual, and Trial Balance each existed as separate raw exports with no shared keys or consistent formats.
There was no automated way to calculate Gross Profit, EBITDA, or Net Income. Revenue, COGS, and OpEx lines were unclassified inside a flat general ledger with no P&L mapping.
Null primary keys, duplicate transactions, inconsistent currency codes, mixed date formats, and zero-value GL rows that added noise to every aggregation and financial report.
The platform needed to serve multiple enterprise clients from a single Databricks catalog, with strict data isolation and Row-Level Security so each client only sees their own financials.
Solution Architecture
The platform is built on the Databricks Medallion pattern with a dedicated catalog (dest_intelligence). Each layer adds quality, structure, and business logic before surfacing to Power BI via a full star schema.
Silver Layer
Every financial module goes through its own tailored cleaning logic in PySpark, targeting the specific data quality issues found in each source, before being written to certified Delta Silver tables.
Deduplication on account_code, initcap on type/subtype/name, null guard on PK, boolean cast on is_active and is_summary.
Null posting dates and zero debit+credit rows dropped. Derived: net_amount, fiscal_quarter, fiscal_year_period. DecimalType(18,2) enforced on amounts.
Derived days_to_pay, is_overdue flag, fiscal_quarter. Zero-amount invoices dropped. Aging bucket and payment status standardised to Title Case.
Same pattern as AR: days_to_pay, is_overdue derived. Vendor category, expense category, and cost center trimmed. Zero bills dropped.
Variance amount and variance percent coalesced to 0. Scenario and budget_version standardised. Fiscal quarter derived from period.
Period-level debit/credit summaries validated per account. Orphan account codes cross-checked against cleaned CoA. Balance integrity verified.
Gold Layer
The Gold layer materialises a fully modelled star schema inside Databricks. Dimension tables hold clean master data; fact tables hold pre-aggregated KPIs ready for Power BI DirectQuery or scheduled import.
Power BI Deliverables
Each dashboard targets a specific financial audience and draws from the Gold layer tables via a fully modelled star schema, with 30+ DAX measures covering time intelligence, variance, and liquidity analysis.
Full income statement from Revenue to Net Income with EBITDA, gross margin, and YoY / MoM growth trends.
AR and AP balances, working capital position, and current ratio with period-over-period movement.
DSO, DPO, overdue AR/AP, AR aging buckets, and collection rate efficiency for treasury teams.
Variance analysis by department and account, with departments-over-budget count and budget attainment percentage.
Pre-aggregated KPI cards with RAG status indicators: On Track / Watch / Alert across revenue, AR health, and budget compliance.
Deliverable Preview
The Gold layer connects to Power BI via Databricks Partner Connect, with a recommended daily 6 AM refresh schedule. All KPI cards update automatically from the certified Delta tables.
Results
Technology