Exploring HTTP Pooling Improvements for Instant Payments

By Oleksii Pylypenko
CaaS Tech Lead, Kyriba

Instant payments initiatives have been gaining momentum, with systems under development all over the world. In this series of articles, we describe key findings on possible performance optimizations for instant payments and banking, in the form of proofs of concept and measured outcomes. In this article, Oleksii Pylypenko, CaaS Tech Lead, explores how Kyriba has improved the pooling process for instant payments.

Multi-Tenant Isolation

As an enterprise liquidity network, Kyriba connects banks and fintech providers with corporate clients. At its core, Kyriba provides access to bank information from hundreds of different sources. Based on this data platform, Kyriba enables decision-making for treasurers, which then leads to payment and transaction execution.

In Kyriba, every customer has guaranteed multi-tenant isolation. This requires a more sophisticated way to handle connectivity to banks, as most banks require mTLS.

Why might this become a problem?

In a typical setup, bank connections require a keystore, a Java SSLContext, and an HTTP client established per user. The keystore contains a key unique to the user and a certificate to prove the user's mTLS identity. Spawning these objects for each request consumes significant resources: measurements on the proof-of-concept code show that such a request takes 524.55 ± 10.93 ms to execute, while the same request without mTLS takes 301.32 ± 2.31 ms.
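To make the per-request overhead concrete, here is a minimal sketch (our illustration, not Kyriba's PoC code): every request rebuilds its key material and SSLContext from scratch. An empty in-memory keystore and null managers stand in for the real tenant key material.

```java
import javax.net.ssl.SSLContext;
import java.security.KeyStore;

// Illustrative sketch: the cost being measured is re-creating key material
// and an SSLContext on every single request.
final class PerRequestSetup {
    static SSLContext freshContext() throws Exception {
        // In the real setup this keystore would hold the tenant's private key
        // and mTLS certificate; an empty in-memory keystore stands in here.
        KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
        ks.load(null, null);

        // Real code would derive key/trust managers from the keystore above
        // and pass them to init(); null managers keep the sketch self-contained.
        SSLContext ctx = SSLContext.getInstance("TLSv1.2");
        ctx.init(null, null, null);
        return ctx;
    }
}
```

Each call returns a brand-new SSLContext, which is exactly the allocation the measured ~220 ms gap comes from.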

Troubleshooting the Problem

Kyriba has explored several ways to address the issue.

The first approach we explored was to use a cache. The idea was to reduce each connection pool to 1-2 connections and maintain a rather large number of such pools. This approach has numerous disadvantages: cache misses, high resource usage, the complexity of managing cases where a larger number of connections per pool is required, and so on. Essentially, it only masks the problem rather than solving it.
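The cache idea can be sketched as a bounded LRU map of per-tenant HTTP clients (the class and names below are ours, for illustration only). The sketch also shows the weakness: every cache miss would rebuild the tenant's keystore and SSLContext anyway.

```java
import java.net.http.HttpClient;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the cache approach: a bounded LRU map of per-tenant clients,
// each of which would be backed by that tenant's own SSLContext.
final class TenantClientCache extends LinkedHashMap<String, HttpClient> {
    private final int maxEntries;

    TenantClientCache(int maxEntries) {
        super(16, 0.75f, true); // access-order iteration gives LRU semantics
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, HttpClient> eldest) {
        return size() > maxEntries; // evict the least-recently-used tenant pool
    }

    HttpClient clientFor(String tenantId) {
        // On a miss, real code would rebuild keystore + SSLContext here,
        // which is exactly the cost this approach fails to eliminate.
        return computeIfAbsent(tenantId, id -> HttpClient.newHttpClient());
    }
}
```

With many tenants and a bounded cache, misses are frequent, so the expensive setup path keeps getting hit.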

The next approach was to use one Apache HTTP Client and spawn an SSL context per user as needed through a "socket registry" extension. This almost worked, but internal caching produced wrong results when the hosts of established connections were reused, and several users connecting to the same bank is a typical situation.

// PoC code: override the socket-factory registry so each request gets a
// socket factory backed by tenant-specific key material.
context.setAttribute("http.socket-factory-registry", (Lookup<ConnectionSocketFactory>) s -> {
  if (!s.equals("https")) {
    throw new RuntimeException("unknown scheme '" + s + "'");
  }
  return cache.computeIfAbsent(rh.port, (p) -> {
    try {
      var sslContextBuilder = SSLContexts.custom()
              .setProtocol("TLSv1.2");

      sslContextBuilder.loadTrustMaterial(truststore(List.of(rh.serverId)), new TrustSelfSignedStrategy());
      // pick one of the tenant identities at random for this PoC run
      int userN = ThreadLocalRandom.current().nextInt(rh.userIds.size());
      Identity oneOfUserIds = rh.userIds.get(userN);
      sslContextBuilder.loadKeyMaterial(keystore(oneOfUserIds), "password".toCharArray());

      SSLContext sslContext = sslContextBuilder.build();

      return new SSLConnectionSocketFactory(sslContext, NoopHostnameVerifier.INSTANCE);
    } catch (Exception ex) {
      throw new RuntimeException("failed to create SSL context", ex);
    }
  });
});


After some time, Kyriba arrived at an approach that works: the "sslcontext-kickstart" library. With it, Kyriba can create a single SSL context holding a multitude of identities. These identities are distinguished and routed by host, using special virtualized hostnames to execute requests. The result is quite simple and works very well.


SSLFactory sslFactory = SSLFactory.builder()
       .withIdentityMaterial("jpmc.jks", "password".toCharArray())
       // replaced by real keystores with identities in prod code
       .withIdentityMaterial("hsbc.jks", "password".toCharArray())
       .withTrustMaterial("truststore.jks", "password".toCharArray())
       .withIdentityRoute("jpmc", "https://api.jpmc.com/...")
       .withIdentityRoute("hsbc", "https://api.hsbc.com/...")
       .build();

SSLContext ctx = sslFactory.getSslContext();
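The routing idea behind the identity routes can be sketched with plain JDK types (our naming, not the library's API): a lookup from target host to the SSLContext holding the matching tenant identity, with a shared fallback.

```java
import javax.net.ssl.SSLContext;
import java.util.Map;

// Hypothetical sketch of identity routing: pick the SSL context that holds
// the right identity for the target host, falling back to a default.
final class IdentityRouter {
    private final Map<String, SSLContext> routesByHost;
    private final SSLContext fallback;

    IdentityRouter(Map<String, SSLContext> routesByHost, SSLContext fallback) {
        this.routesByHost = routesByHost;
        this.fallback = fallback;
    }

    // The library performs a lookup like this internally when a request is
    // executed against one of the configured (possibly virtualized) hostnames.
    SSLContext forHost(String host) {
        return routesByHost.getOrDefault(host, fallback);
    }
}
```

Because the contexts are built once up front, the per-request work reduces to this constant-time lookup.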


Next, we measured the resulting latency.


According to the proof-of-concept implementation, a request now takes 303.563 ± 1.85 ms. This is a very good result, close to requests that do not use mTLS at all. Furthermore, such an approach has no moving parts, as most of the structures are statically allocated.

In future articles, we will explore additional ways to optimize instant payments and banking. Future installments will include:

  • Autoscaling instant payment processors based on request rate, queue length, and throughput.

  • Leveraging pools of external IPs to scale multi-tenant instant payments.

  • Leveraging backpressure to schedule requests.

  • Fair scheduling of instant payments.

  • Using webhooks for payment status updates.