Wednesday January 27, 2021
1:00 PM UTC

Session Leads: Justus Ortlepp (@justusortlepp_2), Jason Darmanovich, Greg McCormick (@gm-sybrin)

Session Description: We will be presenting our report on the Proof of Concept for the implementation of a Fraud Risk Management solution based exclusively on Open Source Software components, as well as a demonstration of the solution to evaluate incoming transfers against implemented typologies.

Hi Simeon,

Jason Darmanovich has replaced Tom.


Made the change @gm-sybrin. Thanks.

This is looking good! Would love to understand the cost of the environment running the tests - 10K TPS on a laptop is more impressive than 10K TPS on 100 high end servers.

@RobReeve, do you have any data on anonymisation failures? I’d be interested to look at it

They’re getting that for you, Matt. They want to give you the compute power of the nodes, not just the number of nodes.

Possibly also trying to establish cost.

Hi @MBohan

The base Kubernetes cluster comprises:

  • 10 x Nodes Standard B12_MS (12 vCPU, 48GB RAM)
  • 10 x Nodes Standard E4S_V3 (4 vCPU, 32GB RAM)
  • 2 x Nodes Standard DS2_V2 (2 vCPU, 7GB RAM)

The B12_MS nodes were targeted by NiFi as its dedicated node pool.

Logically laid out as follows:

  • 24 x NiFi @ 2CPU 4GB RAM
  • 6 x Elasticsearch @ 2CPU 4GB RAM
  • 4 x Kafka @ 4CPU 4GB RAM
  • 4 x Redis @ 2CPU 4GB RAM
  • For each typology we ran 3 replicas @ 1CPU 1GB RAM
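
For anyone who wants to sanity-check the totals, here is a rough Python sketch that adds up the figures listed above. It assumes the CPU and RAM numbers in the logical layout are per-replica allocations, which is only my reading of the list. The number of typologies is not stated in this thread, so `typology_count` is a hypothetical parameter you would need to fill in yourself.

```python
# Node pools as listed above: (node count, vCPU per node, RAM in GB per node)
node_pools = {
    "B12_MS": (10, 12, 48),
    "E4S_V3": (10, 4, 32),
    "DS2_V2": (2, 2, 7),
}

# Workloads as listed above: (replicas, CPU per replica, RAM in GB per replica)
# Assumption: the "@ 2CPU 4GB RAM" figures apply to each replica.
workloads = {
    "NiFi": (24, 2, 4),
    "Elasticsearch": (6, 2, 4),
    "Kafka": (4, 4, 4),
    "Redis": (4, 2, 4),
}


def totals(groups):
    """Return (total CPU, total RAM in GB) across all groups."""
    cpu = sum(count * cpu_each for count, cpu_each, _ in groups.values())
    ram = sum(count * ram_each for count, _, ram_each in groups.values())
    return cpu, ram


def typology_totals(typology_count, replicas=3, cpu_each=1, ram_each=1):
    """CPU and RAM for the typology pods; typology_count is not given in the thread."""
    pods = typology_count * replicas
    return pods * cpu_each, pods * ram_each


cluster_cpu, cluster_ram = totals(node_pools)
workload_cpu, workload_ram = totals(workloads)

print(f"Cluster capacity: {cluster_cpu} vCPU, {cluster_ram} GB RAM")
print(f"Listed workloads (excluding typologies): {workload_cpu} CPU, {workload_ram} GB RAM")
```

With the numbers from the post this gives 164 vCPU and 814 GB RAM of node capacity, against 84 CPU and 152 GB RAM allocated to NiFi, Elasticsearch, Kafka and Redis, before any typology replicas are counted.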

A quick Google search will find you lots of examples of data anonymisation issues, @MichaelRichards, but I have been trying to find the fun one - which may be an urban myth - of a release of an anonymised data set by Uber that showed that a given individual (tracked by his house being isolated) was visiting someone other than his wife.

I will keep searching…

Updated - this Forbes article also explains how quickly it can be reversed.