Data Cleansing

Break tests identify potential outliers during the data anomaly detection process. You can define a variety of bespoke break tests, each with its own thresholds and applicability (e.g. asset class, currency, trade type), and optional tenor overrides (i.e. different thresholds for specific tenors). A ‘child test’ can also be set up so that it does not trigger breaks on a standalone basis (e.g. a day-on-day move), but only under certain conditions, for example when the magnitude of the move differs from one source to another.

The types of break tests in Xplain are:

market data break tests
valuation data break tests, for which you will need to specify the currency in which third-party valuation data is reported

We also have a similar concept of break test for new trade onboarding, where a report is generated to identify potential trade misbooking (without a two-stage resolution process).

1. Break Tests for Market Data

Break tests for market data can be classified into two categories which will be applied in the following order:

preliminary break tests, to assist in identifying outliers on a standalone basis
overlay break tests, applied on preliminary cleaned data, to assist in identifying outliers by comparing two sets of data (day-on-day, source-to-source, or day-on-day by reference to historical (joint) distribution, e.g. ‘Z-score’ test)

Preliminary break tests include a ‘NULL’ test which you cannot disable. They can be applied on a single curve date or on a set of curve dates (which can be useful for batch cleaning of historical data).

Overlay break tests can be defined with up to three different break thresholds (to prioritise the severity level).

Market Data Break Test Definitions
Break test definitions for market data are described in the tables below, with the following conventions:

P(t,i), i = 1 (primary provider) or 2 (secondary provider)
ΔP(t,i) = P(t,i) - P(t-1,i)
For day-on-day overlay break tests, the waterfall to determine P(t-1,i) will be: overlay value -> preliminary value -> raw value
For source-to-source overlay break tests, the waterfall to determine P(t,i) will be: preliminary value -> raw value
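
The two waterfalls above can be read as "take the first available value in the stated order". A minimal Python sketch follows, assuming a missing value is represented by `None`; the function names are hypothetical and are not part of Xplain.

```python
from typing import Optional


def first_available(*candidates: Optional[float]) -> Optional[float]:
    """Return the first candidate that is not None, or None if all are missing."""
    for value in candidates:
        if value is not None:
            return value
    return None


def previous_day_value(overlay: Optional[float],
                       preliminary: Optional[float],
                       raw: Optional[float]) -> Optional[float]:
    """P(t-1,i) for day-on-day tests: overlay -> preliminary -> raw."""
    return first_available(overlay, preliminary, raw)


def same_day_value(preliminary: Optional[float],
                   raw: Optional[float]) -> Optional[float]:
    """P(t,i) for source-to-source tests: preliminary -> raw."""
    return first_available(preliminary, raw)
```

For example, `previous_day_value(None, 1.52, 1.50)` returns the preliminary value `1.52` because no overlay value is available.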

If an overlay break test calculation cannot be performed due to a missing raw market data value, this will not trigger a break. However, the number of “skipped” tests will be reported in the dashboard.

Overlay day-on-day break tests will only be applied to data in respect of the primary provider, unless used as a child break test (see below).

TEST DEFINITION	TEST VALUE	ADDITIONAL INPUTS ⁽¹⁾
NULL ⁽²⁾	P(t,i) = NULL
Value	Abs(P(t,i))	Threshold/Factor i
Zero	Abs(P(t,1)) = 0	Providers
Stale ⁽³⁾	P(t,i) = P(t-x,i) = … = P(t-n+1,i), x = 1 to n-1	Observation Period (# Day)
Day-on-day Sign ⁽³⁾	Sign(P(t,i)) <> Sign(P(t-1,i))

⁽¹⁾ See MD break test attributes
⁽²⁾ Always applicable on all providers and all asset classes (i.e. it cannot be disabled)
⁽³⁾ Not applicable as a preliminary (batch) break test
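
As an illustration only, the test values in the table above could be expressed as the Python predicates below. This is a sketch under assumptions: missing data is represented by `None`, the history passed to the Stale test is ordered oldest to newest and ends with P(t,i), the sign of zero is treated as distinct from positive and negative, and all function names are hypothetical.

```python
from typing import Optional, Sequence


def null_test(p: Optional[float]) -> bool:
    """NULL: breaks when P(t,i) is missing."""
    return p is None


def value_test(p: float, threshold: float, inclusive: bool = False) -> bool:
    """Value: compares Abs(P(t,i)) to the threshold ('>' by default, '>=' if inclusive)."""
    return abs(p) >= threshold if inclusive else abs(p) > threshold


def zero_test(p: float) -> bool:
    """Zero: breaks when Abs(P(t,i)) = 0."""
    return abs(p) == 0


def stale_test(history: Sequence[float], n: int) -> bool:
    """Stale: breaks when the last n observations, including P(t,i), are identical."""
    if n < 2 or len(history) < n:
        return False  # assumption: not enough history means no break
    window = history[-n:]
    return all(v == window[0] for v in window)


def _sign(x: float) -> int:
    return (x > 0) - (x < 0)


def sign_test(p_t: float, p_prev: float) -> bool:
    """Day-on-day Sign: breaks when Sign(P(t,i)) <> Sign(P(t-1,i))."""
    return _sign(p_t) != _sign(p_prev)
```

With these definitions, `value_test(-150.0, 100.0)` is true because the test uses the absolute value, and `stale_test([1.0, 2.0, 2.0, 2.0], 3)` is true because the last three observations are identical.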

TEST DEFINITION	SCALING	TEST VALUE	ADDITIONAL INPUTS ⁽¹⁾
Day-on-day	Absolute Difference	Abs(ΔP(t,i))	Threshold/Factor i Operator
Day-on-day	Relative Difference	Abs(ΔP(t,i) / P(t-1,i))	Threshold/Factor i Operator
Value	Z-score	Abs[(ΔP(t,i) - Mean(t-1,i))/Stdev(t-1,i)]	Z-Score Observation Period Threshold/Factor i Operator
Value	Conditional Z-score	Abs[(ΔP(t,i) - ConditionalMean(t-1,i))/ ConditionalStdev(t-1,i)]	Z-Score Observation Period Threshold/Factor i Operator
Primary vs Secondary Provider	Absolute Difference	Abs[P(t,2) - P(t,1)]	Threshold/Factor i Operator Child Break Test (Optional) ⁽²⁾
Primary vs Secondary Provider	Relative Difference	Abs[(P(t,2) - P(t,1))/P(t,1)]	Threshold/Factor i Operator Child Break Test (Optional) ⁽²⁾

⁽¹⁾ See MD break test attributes
⁽²⁾ Where applicable, P(t,i) will represent the calculation result of the selected child break test for the relevant provider
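
The overlay test values above can be sketched in Python as follows. The sketch assumes that Mean(t-1,i) and Stdev(t-1,i) are the sample mean and sample standard deviation of the day-on-day changes observed over the Z-score observation period up to t-1; the source does not state which estimator is used, and the function names are hypothetical. The conditional Z-score is not covered.

```python
import statistics
from typing import Sequence


def dod_absolute(p_t: float, p_prev: float) -> float:
    """Day-on-day, Absolute Difference: Abs(ΔP(t,i))."""
    return abs(p_t - p_prev)


def dod_relative(p_t: float, p_prev: float) -> float:
    """Day-on-day, Relative Difference: Abs(ΔP(t,i) / P(t-1,i))."""
    return abs((p_t - p_prev) / p_prev)


def z_score(delta_t: float, past_deltas: Sequence[float]) -> float:
    """Z-score: Abs[(ΔP(t,i) - Mean) / Stdev], using sample statistics (assumption)."""
    mean = statistics.mean(past_deltas)
    stdev = statistics.stdev(past_deltas)  # needs at least two past changes
    return abs((delta_t - mean) / stdev)


def source_absolute(p1: float, p2: float) -> float:
    """Primary vs Secondary, Absolute Difference: Abs[P(t,2) - P(t,1)]."""
    return abs(p2 - p1)


def source_relative(p1: float, p2: float) -> float:
    """Primary vs Secondary, Relative Difference: Abs[(P(t,2) - P(t,1)) / P(t,1)]."""
    return abs((p2 - p1) / p1)


def source_absolute_on_child_dod(p1_t: float, p1_prev: float,
                                 p2_t: float, p2_prev: float) -> float:
    """Child test usage: compare each provider's day-on-day move across sources."""
    return source_absolute(dod_absolute(p1_t, p1_prev), dod_absolute(p2_t, p2_prev))
```

The last function illustrates footnote ⁽²⁾: when a day-on-day test is selected as the child break test, P(t,i) in the source-to-source formula is replaced by that provider's day-on-day result, so a break is driven by the difference in the size of the move rather than by the move itself.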

In our sandbox environment, in addition to the NULL test that cannot be disabled, we have predefined the following tests:

A Quantum preliminary break test, which triggers a break when the data value is above 100
An EUR IRS Source to Source overlay break test, which compares the market data provided by the primary and the secondary providers for EUR swap rates. It triggers a break when the relative difference is above 5%. It also contains an override for tenors shorter than 2Y, for which a more lenient threshold of 7% is applied.
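
To make the tenor override concrete, the sandbox EUR IRS Source to Source test can be sketched as below. This is a hedged illustration rather than Xplain's implementation: the tenor format ("6M", "2Y", ...), the tenor conversion and the function names are assumptions, and "above" is read as the `>` operator.

```python
def tenor_in_years(tenor: str) -> float:
    """Convert a tenor such as '6M' or '10Y' to years (simplified day counts)."""
    amount = float(tenor[:-1])
    unit = tenor[-1].upper()
    per_year = {"D": 365.0, "W": 52.0, "M": 12.0, "Y": 1.0}
    return amount / per_year[unit]


DEFAULT_THRESHOLD = 0.05   # 5% relative difference
SHORT_TENOR_THRESHOLD = 0.07  # 7% override for tenors shorter than 2Y


def eur_irs_source_to_source_break(p1: float, p2: float, tenor: str) -> bool:
    """True when the primary/secondary relative difference is above the threshold."""
    threshold = SHORT_TENOR_THRESHOLD if tenor_in_years(tenor) < 2.0 else DEFAULT_THRESHOLD
    relative_difference = abs((p2 - p1) / p1)
    return relative_difference > threshold
```

With a primary rate of 2.00 and a secondary rate of 2.12, the relative difference is 6%: this triggers a break for the 10Y tenor (threshold 5%) but not for the 1Y tenor (threshold 7%).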

The example below will guide you through re-defining the EUR IRS Source to Source test. To do so, you will first have to archive the existing test.

Under Preferences/Data Cleansing/Break Test Definitions/Market Data, you can create a break test by clicking on Add New (or edit an existing one by double-clicking on the line item).

[Figure: Creating a market data break test, under Preferences/Data Cleansing/Break Test Definitions/Market Data]

Field Name	Description	Permissible Values
Test Type	The type of break test	Preliminary | Preliminary (Batch) | Overlay
Test Definition	The break test measure definition	See MD break test definitions
Child Break Test	The test value used as underlying measure	N/A | An existing MD overlay break test
Break Test Name	The name of the break test	Free text
Asset Classes	Test coverage per asset class / instrument type	RATES | Rates instrument type(s) (e.g. "IR Rate"); CREDIT | Credit instrument type(s) (e.g. "CDS"); FX | FX instrument type(s) (e.g. "FX Spot"); TRS | TRS instrument type(s) (e.g. "BOND")
Rates Currencies ⁽¹⁾	Currency granularity for Asset Classes = "RATES" or any sub-category	A permissible currency
IR Instruments ⁽¹⁾	IR Instrument granularity for Asset Classes = "RATES" or "RATES/IR Rates"	See IR instruments
Credit Sectors ⁽¹⁾	Sector granularity for Asset Classes = "CREDIT" or any sub-category	See permissible credit sectors
FX Ccy Pairs ⁽¹⁾	FX pair granularity for Asset Classes = "FX" or any sub-category	See fx rate rule
Scaling ⁽²⁾	The applicable scaling	See MD break test definitions
Z-Score Observation Period ⁽²⁾	Number of years in the observation period	1Y to 5Y
Threshold/Factor 1 ⁽²⁾ ⁽³⁾	Threshold value that will be compared to the test value	Numeric (positive)
Threshold/Factor 2 ⁽²⁾ ⁽³⁾	Second threshold value for escalation purposes	Numeric (positive) (optional)
Threshold/Factor 3 ⁽²⁾ ⁽³⁾	Third threshold value for escalation purposes	Numeric (positive) (optional)
Operator ⁽²⁾	Operator to apply between the test calculation result and the Threshold	> | >=
Observation Period (# Day) ⁽²⁾	Number of historical data points in the observation period	Integer, n > 1

⁽¹⁾ Only applicable if the relevant Asset Class is selected
⁽²⁾ Where applicable. See MD break test definitions.
⁽³⁾ Threshold 2 and Threshold 3, where set, have to be in increasing order. When a test value breaches more than one threshold, the greatest threshold breached will be the one reported in the XM workflow.
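
The interaction between the Operator and the up to three thresholds can be sketched as a small severity function. The sketch assumes that an unset optional threshold is represented by `None` and that level 0 means "no break"; the function name and the numbering of levels are illustrative and not taken from Xplain.

```python
from typing import Optional, Sequence


def severity_level(test_value: float,
                   thresholds: Sequence[Optional[float]],
                   operator: str = ">") -> int:
    """Return the position (1 to 3) of the greatest threshold breached, or 0 if none."""
    if operator not in (">", ">="):
        raise ValueError("operator must be '>' or '>='")
    defined = [t for t in thresholds if t is not None]
    if any(t <= 0 for t in defined):
        raise ValueError("thresholds must be positive")
    if any(later <= earlier for earlier, later in zip(defined, defined[1:])):
        raise ValueError("thresholds must be in increasing order")

    level = 0
    for position, threshold in enumerate(thresholds, start=1):
        if threshold is None:
            continue
        breached = test_value >= threshold if operator == ">=" else test_value > threshold
        if breached:
            level = position  # keep the greatest threshold breached
    return level
```

For thresholds 3, 5 and 7, a test value of 6 breaches the first two thresholds and is reported at level 2, while a test value of exactly 3 only triggers a break when the operator is `>=`.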

2. Break Tests for Valuation Data

Prior to running a valuation data anomaly detection process, you will first need to define the currency in which third-party valuation data is reported (i.e. trade currency or reporting currency).

Break tests for valuation data can be split into two categories which will be applied in the following order:

Overlay I break tests, where the base value will be that of the primary provider (e.g. day-on-day)
Overlay II break tests, where the base value will be that of the Overlay I cleaned data (e.g. source-to-source)

Overlay I break tests include a ‘NULL’ test which you cannot disable.

Overlay break tests can be defined with up to three different break thresholds (to prioritise the severity level), and the difference in data can be scaled in various ways prior to being compared (e.g. relative value, notional, NAV, DV01).

When performing valuation data break resolution and approval, the order of the break test results in the resolution / approval table will reflect the row ordering of your break test definition under Preferences/Data Cleansing/Break Test Definitions/Valuation Data.

Valuation Data Break Test Definitions
Break test definitions and applicable scalings for valuation data are described in the table below, with the following conventions:

PV(t,i), i = 1, 2, 3 or 4 (Base, Secondary, Tertiary or Quaternary Provider) (where applicable)
ΔPV(t,i) = PV(t,i) - PV(t-1,i)
For source-to-source break tests, the waterfall to determine PV(t,i) will be: Overlay I value -> raw value
For day-on-day break tests, t-1 will be defined as the latest date (prior to the current date) on which a VD XM dashboard was run for the pricing slot and portfolio(s) in scope. PV(t-1,i) will be the final verified value when such dashboard was completed.

If an overlay break test calculation cannot be performed due to a missing raw valuation data value, this will not trigger a break. However, the number of “skipped” tests will be reported in the dashboard. This includes the case where a VD XM dashboard is run for the first time and no day-on-day break test calculation is possible, due to the absence of a previous day’s verified value.
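
The "skipped" behaviour can be illustrated with a day-on-day absolute difference test run over a set of trades. This is a minimal sketch assuming valuations are held in dictionaries keyed by a trade identifier and that a missing value is either absent or `None`; the data layout and function name are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def run_day_on_day_absolute(current: Dict[str, Optional[float]],
                            previous_verified: Dict[str, Optional[float]],
                            threshold: float) -> Tuple[List[str], int]:
    """Return (trade ids with a break, number of skipped tests)."""
    breaks: List[str] = []
    skipped = 0
    for trade_id, pv_t in current.items():
        pv_prev = previous_verified.get(trade_id)
        if pv_t is None or pv_prev is None:
            skipped += 1  # cannot compute the test: counted, but never a break
            continue
        if abs(pv_t - pv_prev) > threshold:
            breaks.append(trade_id)
    return breaks, skipped
```

On a first run, `previous_verified` is empty, so every test is counted as skipped and no day-on-day break is raised.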

Overlay break tests will be applied to data in respect of all providers. However, they will only trigger a break in respect of the base value, unless used as a child break test (see below).

The ‘Stale’ break test can only be applied to data in respect of the primary provider (i=1), as set out in the table below.

TEST DEFINITION	SCALING	TEST VALUE	ADDITIONAL INPUTS ⁽¹⁾
NULL ⁽²⁾		PV(t,i) = NULL
Value		Abs(PV(t,i))	Providers Threshold/Factor i Operator
Zero		Abs(PV(t,i)) = 0	Providers
Stale		PV(t,1) = PV(t-x,1) = … = PV(t-n+1,1), x = 1 to n-1	Observation Period (# Day)
Day-on-day Sign		Sign(PV(t,i)) <> Sign(PV(t-1,i))	Providers
Day-on-day	Absolute Difference	Abs(ΔPV(t,i))	Providers Threshold/Factor i Operator
Day-on-day	Relative Difference	Abs[ΔPV(t,i)/PV(t-1,i)]	Providers Threshold/Factor i Operator
Day-on-day	Greeks – 01 ⁽³⁾ ⁽⁷⁾	Abs[ΔPV(t,i)/((01(t,i) + 01(t-1,i))/2)]	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾
Day-on-day	Greeks – Vega ⁽⁷⁾	Abs[ΔPV(t,i)/((Vega(t,i) + Vega(t-1,i))/2)]	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾
Day-on-day	Greeks – 01 + Vega ⁽³⁾ ⁽⁷⁾	Abs[ΔPV(t,i)/((01(t,i) + 01(t-1,i))/2 + (Vega(t,i) + Vega(t-1,i))/2)]	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾
Day-on-day	Greeks – Day-on-Day ⁽³⁾ ⁽⁴⁾	Abs[ΔPV(t,i)/((01(t,i) + 01(t-1,i))/2 * ΔParRate(t,i) + (Vega(t,i) + Vega(t-1,i))/2 * ΔImpliedVol(t,i))]	Providers Threshold/Factor i Operator
Day-on-day	NAV ⁽⁶⁾	Abs[ΔPV(t,i)/NAV(t)] * 10,000	Providers Threshold/Factor i Operator
Day-on-day	Notional ⁽⁶⁾	Abs[ΔPV(t,i)/Notional(t)] * 10,000	Providers Threshold/Factor i Operator
Primary vs Secondary Provider / Primary vs Tertiary Provider / Primary vs Quaternary Provider	Absolute Difference	Abs[PV(t,j) - PV(t,i)]	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾
Primary vs Secondary Provider / Primary vs Tertiary Provider / Primary vs Quaternary Provider	Relative Difference	Abs[(PV(t,j) - PV(t,i))/PV(t,i)]	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾
Primary vs Secondary Provider / Primary vs Tertiary Provider / Primary vs Quaternary Provider	Greeks – 01 ⁽³⁾	Abs[(PV(t,j) - PV(t,i))/01(t,i)]	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾
Primary vs Secondary Provider / Primary vs Tertiary Provider / Primary vs Quaternary Provider	Greeks – Vega	Abs[(PV(t,j) - PV(t,i))/Vega(t,i)]	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾
Primary vs Secondary Provider / Primary vs Tertiary Provider / Primary vs Quaternary Provider	Greeks – 01 + Vega ⁽³⁾	Abs[PV(t,j) - PV(t,i)]/[Abs(01(t,i)) + Abs(Vega(t,i))]	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾
Primary vs Secondary Provider / Primary vs Tertiary Provider / Primary vs Quaternary Provider	NAV ⁽⁶⁾	Abs[(PV(t,j) - PV(t,i))/NAV(t)] * 10,000	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾
Primary vs Secondary Provider / Primary vs Tertiary Provider / Primary vs Quaternary Provider	Notional ⁽⁶⁾	Abs[(PV(t,j) - PV(t,i))/Notional(t)] * 10,000	Threshold/Factor i Operator Child Break Test (Optional) ⁽⁵⁾

⁽¹⁾ See VD break test attributes
⁽²⁾ Always applicable on all providers and all trade types (i.e. it cannot be disabled)
⁽³⁾ With 01 being the most relevant delta according to the product type (e.g. CS01 for CDS)
⁽⁴⁾ Only applicable if valuation data provider = "Xplain"
⁽⁵⁾ Where applicable, PV(t,i) will represent the calculation result of the selected child break test for the relevant provider
⁽⁶⁾ Applicable NAV(t) and Notional(t) are valuation data themselves, with 'NAV' and 'NOTIONAL' as respective data provider
⁽⁷⁾ Calculated using the scaled mean of t and t-1 Greeks
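
A few of the scalings above are worked through in the Python sketch below, to show how the test value becomes a multiple of a Greek or a number of basis points. The function names and the sample figures are hypothetical; the formulas follow the table, with the day-on-day Greeks scaling using the mean of the t and t-1 Greeks as per footnote ⁽⁷⁾.

```python
def dod_greeks_01(pv_t: float, pv_prev: float,
                  greek_t: float, greek_prev: float) -> float:
    """Day-on-day, Greeks 01: Abs[ΔPV / ((01(t) + 01(t-1)) / 2)]."""
    mean_greek = (greek_t + greek_prev) / 2
    return abs((pv_t - pv_prev) / mean_greek)


def dod_nav_bps(pv_t: float, pv_prev: float, nav_t: float) -> float:
    """Day-on-day, NAV: Abs[ΔPV / NAV(t)] * 10,000, i.e. basis points of NAV."""
    return abs((pv_t - pv_prev) / nav_t) * 10_000


def source_greeks_01_plus_vega(pv_i: float, pv_j: float,
                               greek_01: float, vega: float) -> float:
    """Source-to-source, Greeks 01 + Vega: Abs[PV(j) - PV(i)] / [Abs(01) + Abs(Vega)]."""
    return abs(pv_j - pv_i) / (abs(greek_01) + abs(vega))
```

For a trade whose value moves from 1,000,000 to 1,050,000 with an 01 of 6,000 yesterday and 4,000 today, `dod_greeks_01` gives 50,000 / 5,000 = 10, so the move is 10 times the average 01. Against a NAV of 100,000,000 the same move is 5 basis points.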

In our sandbox environment, in addition to the ‘NULL’ test that cannot be disabled (*), we have predefined the following tests:

a DV01 day-on-day Overlay I break test, for IRS trades, which triggers a break when the change in value is above 10 times the relevant 01 of the trade, with a 5 times override threshold for MXN rates trades (**)
a PvS DV01 source-to-source Overlay II break test, which compares the Overlay I cleaned data to the raw valuation data provided by the secondary provider for linear trades. It triggers a break when the difference in value is above 3, 5 and 7 times the relevant 01 of the trade respectively, according to the severity level. (**)
a PvS Vega source-to-source Overlay II break test, which compares the Overlay I cleaned data to the raw valuation data provided by the secondary provider for option trades. It triggers a break when the difference in value is above 3, 5 and 7 times the vega of the trade respectively, according to the severity level.

(*) If the ‘NULL’ break toggle is ‘disabled’, the ‘NULL’ break test will still be run, but the corresponding test result column will be hidden from the display, as ‘NULL’ will be shown as the valuation data itself.
(**) The relevant 01 means DV01 for rates trades, INF01 for inflation trades and CS01 for credit trades.

The example below will guide you through re-defining the PvS Vega test. To do so, you will first have to archive the existing test.

Under Preferences/Data Cleansing/Break Test Definitions/Valuation Data, you can create a break test by clicking on Add New (or edit an existing one by double-clicking on the line item).

Field Name	Description	Permissible Values
Test Type	The type of break test	Overlay I | Overlay II
Test Definition	The break test measure definition	See VD break test definitions
Child Break Test	The test value used as underlying measure	N/A | An existing VD break test
Break Test Name	The name of the break test	Free text
Company Entity Portfolio	A list of in-scope companies / entities / portfolios	An existing Company ID / Entity ID / Portfolio ID
Trade Type	Test coverage per asset class / trade type	RATES | Rates trade type(s) (e.g. "IRS"); CREDIT | Credit trade type(s) (e.g. "CDS"); FX | FX trade type(s) (e.g. "FX Forward"); CUSTOM_RATES | Custom Rates 1 to 5; CUSTOM_FX | Custom FX 1 to 5; CUSTOM_COMMODITY | Custom Commodity 1 to 5; CUSTOM_EQUITY | Custom Equity 1 to 5; CUSTOM_CREDIT | Custom Credit 1 to 5; CUSTOM_OTHER | Custom Other 1 to 5
Rates Currencies ⁽¹⁾	Currency granularity for Trade Type = "RATES" or any sub-category	A permissible currency
Credit Sectors ⁽¹⁾	Sector granularity for Trade Type = "CREDIT" or any sub-category	See permissible credit sectors
FX Ccy Pairs ⁽¹⁾	FX pair granularity for Trade Type = "FX" or any sub-category	See fx rate rule
Providers ⁽²⁾	Breaks will only be triggered against P1. For information purposes, you can run the test against other providers, but it will not trigger a break.	Primary (P1); Secondary (P2) - Test calculation only; Tertiary (P3) - Test calculation only; Quaternary (P4) - Test calculation only
Scaling ⁽²⁾	The applicable scaling	See VD break test definitions
Threshold/Factor 1 ⁽²⁾ ⁽³⁾	Threshold value that will be compared to the test value	Numeric (positive)
Threshold/Factor 2 ⁽²⁾ ⁽³⁾	Second threshold value for escalation purposes	Numeric (positive) (optional)
Threshold/Factor 3 ⁽²⁾ ⁽³⁾	Third threshold value for escalation purposes	Numeric (positive) (optional)
Operator ⁽²⁾	Operator to apply between the test calculation result and the Threshold	> | >=
Observation Period (# Day) ⁽²⁾	Number of historical data points in the observation period	Integer, n > 1

⁽¹⁾ Only applicable if the relevant Trade Type is selected
⁽²⁾ Where applicable. See VD break test definitions.
⁽³⁾ Threshold 2 and Threshold 3, where set, have to be in increasing order. When a test value breaches more than one threshold, the greatest threshold breached will be the one reported in the XM workflow.

Data Cleansing Currency Settings

Prior to running a valuation data anomaly detection process, you will first need to define the currency in which third-party valuation data is reported (i.e. trade currency or reporting currency). This is relevant in particular when Xplain is one of your valuation data providers, to ensure that the correct valuation calculated in Xplain is used during the data cleansing workflow.

Under Preferences/Data Cleansing/Break Test Definitions/Valuation Data, you can edit the data cleansing currency settings by clicking on Edit.

The attributes and corresponding permissible values for data cleansing currency settings are set out in the table below.

Field Name	Description	Permissible Values
Currency Type	Currency of third-party valuation data	TRADE_CCY | REPORTING_CCY
