OfferZen ran a hackathon with members of the Programmable Banking Community to see what exciting projects teams could build using the programmable banking tech in a short space of time.
In this demo, Adam Fisher guides us through his team’s Budgie platform, which allows users to securely review transactions, manage budgets, and compare their spend to others in an anonymous transaction pool. The tech the team used includes CDK, AWS Serverless (Lambda and DynamoDB), Node.js, HTML and JavaScript. Check it out here!
Transcript of the demo
Adam Fisher: (00:00)
We're Team Budgie. It was me, Huggs and Imraan, and this was our project for the hackathon that was run at the end of last year. We all aligned on a solution for securely managing and sharing transaction data. This is related to two things: number one, a personal need for being able to review transactions and manage budgets, and we wanted to do something along the lines of Gmail, where you can manage your transactions with multiple labels rather than having to put them in a single category, and are able to review them and query them however you like. The second thing was that there was a challenge made a few months back about having an anonymous transaction pool. We wanted to approach that and to be able to make transactions public in a way that's still safe and secure for those people who are sharing their transactions.
Adam: (01:16)
The other side of that, in particular, was that it would be very interesting to be able to get an idea of what other people are spending. We kind of spoke about demographics, but that's definitely a future feature. But just being able to see what other people are spending is an incredibly informative thing. They see what people are spending or earning, even without knowing who those people are, and not having any way of getting access to their personal information.
In terms of what we used, the actual platform itself. Budgie refers to the platform and this is a cloud-based platform. It's entirely run in AWS Serverless, meaning all Lambda and DynamoDB. There are no running costs unless people are using the platform. And it was all built with CDK. And the interfaces, which are just examples that we untidily hacked together for the hackathon in the tradition of the word hackathon, are made using Node.js and HTML and pretty much vanilla JavaScript.
One point: please, if anybody has questions, don't hold them to the end. Feel free to interrupt at any time. There was a suggestion to put a quote on here, so I found one that I liked, which was that “Ultimate security is your understanding of reality” [H Stanley Judd] and this relates to Budgie’s ability, in theory, of being able to see what other people are spending and what other people are spending on and how these things relate. But there's also a play on the concept of security, which is very much a priority for a system that does this.
Adam: (03:10)
How it works. The first part which I will demonstrate is “anonymous user” and “account generation”. The idea is that […] when you generate a user, we have absolutely no idea what your email is, what your name is; we don't care. Basically, what we're doing is we're using your email address for two purposes: number one, to confirm that you're a real entity, and number two, if you get an auth key and you get a user id and if you need to reset your auth key, you'll have to provide your email address. It's hashed into the reset key. Without those we have absolutely no way of knowing who those addresses belong to, whose user accounts belong to who etc.
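As an illustration of the idea of hashing the email address into the reset key, here is a minimal sketch. The function names, the choice of salted SHA-256 and the `salt.hash` key format are all assumptions made for this example; the demo does not show how Budgie actually derives the key.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: normalise the email and hash it with a random salt.
function hashEmail(email, salt) {
  return crypto
    .createHash('sha256')
    .update(salt + ':' + email.trim().toLowerCase())
    .digest('hex');
}

// Hypothetical key format: "<salt>.<salted hash>". The platform can keep
// this key without ever storing the email address itself.
function makeResetKey(email) {
  const salt = crypto.randomBytes(16).toString('hex');
  return salt + '.' + hashEmail(email, salt);
}

// A reset request supplies the email again; it passes only if the hash matches.
function emailMatchesResetKey(email, resetKey) {
  const [salt, expected] = resetKey.split('.');
  const actual = Buffer.from(hashEmail(email, salt), 'hex');
  const wanted = Buffer.from(expected || '', 'hex');
  // timingSafeEqual throws on unequal lengths, so check the length first.
  return actual.length === wanted.length && crypto.timingSafeEqual(actual, wanted);
}
```

With this shape, someone who reads the stored key cannot read the address off it directly, although a guessed address can still be checked against the key.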
Adam: (04:32)
Then once you've logged in, the idea is here, on this page. Again, it is very, very hacky. Huggs's UI for the hackathon was way more pretty, but we felt that this was a better way forward. Again, this is just a sample. Everybody is welcome to build their own interfaces on top of this and everything is open source, so everybody is welcome to spin up their own stacks if they don't trust us with their data.
The idea is that you put in your email address, you initiate a registration, you get emailed a one-time password, you confirm the registration with the one-time password and then you get back a user id and an authentication key and a reset key. And all of those are sent to your email as well, for safekeeping. So, back to the log-in. […] I use Bitwarden. I warmly recommend everybody using Bitwarden.
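The registration steps described above can be sketched roughly as follows. This is an assumption-laden illustration, not Budgie's code: the in-memory `Map` stands in for whatever table the platform uses, and the function names, the six-digit one-time password and the key lengths are all hypothetical.

```javascript
const crypto = require('node:crypto');

// In-memory stand-in for a pending-registration table (an assumption; the
// platform runs on DynamoDB, but its schema is not shown in the demo).
const pending = new Map();

// Hypothetical helper: only a hash of the address is ever kept.
function emailHash(email) {
  return crypto.createHash('sha256').update(email.trim().toLowerCase()).digest('hex');
}

// Step 1: initiate a registration and create a one-time password.
function initiateRegistration(email) {
  const otp = crypto.randomInt(0, 1000000).toString().padStart(6, '0');
  pending.set(emailHash(email), otp);
  return otp; // a real system would email this rather than return it
}

// Step 2: confirm with the one-time password and receive the credentials.
function confirmRegistration(email, otp) {
  const key = emailHash(email);
  if (pending.get(key) !== otp) {
    throw new Error('Invalid one-time password');
  }
  pending.delete(key);
  return {
    userId: crypto.randomUUID(),
    authKey: crypto.randomBytes(32).toString('hex'),
    // Placeholder: the talk says the email is hashed into the reset key,
    // but the exact derivation is not described.
    resetKey: crypto.randomBytes(32).toString('hex'),
  };
}
```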
This is a demo account that I’ve created already, and if I log in, you can see I’ve already uploaded some things. If I log in, then I have a list of accounts. Now these accounts […] I don't know what these are until I’ve configured them and defined them, which I’ll get to in a moment. From these accounts I can now see all of my data, and this transaction data is related to the account ID not to my [inaudible] and the account ID is then related to the user, but these are things that even if I have access to DynamoDB directly I cannot figure out who belongs to what. I would have to do a lot of research and a lot of work. There's nothing open about this. Public transactions, if they have been rendered public, look like this.
Sorry I’m getting ahead of myself. […] Anyway, a quick summary of this is that you generate new accounts very simply. It gives you a user ID and attaches it. And there you go. As you can see, this one would be it. Now if I rename this here. [This is] just a guide to the UI. This is only being stored locally so if I move to a different computer, this doesn't give me any information. So “Demo account, no purpose”, and away we go.
Now having said that, this is just an interface to the Budgie platform, which again is cloud-based. What I’ve done is I’ve created a few accounts and I’ve given them purpose. […] Okay, so what we do […] is on your local machine using our script; again, anything can be used in order to feed the Budgie. I’ve added accounts, so this here is a dummy account UUID that I’ve just created – well, theoretically just created – so that's this one. And this is the bank account number. I’ve obviously written a whole lot of dummy account data in my CSV files that we can use. This is going to be that account there and in order to be able to use the feeder […] So here what I do is add the bank account. I’ve added the other ones already and this gives me a file here, which manages which bank, so that's to add an FNB bank account because I’ve recently moved back to FNB, but for Investec there's a “Load Investec Account” script, which, once you provide it with your API credentials, will automatically download your accounts and then you update that information, those Account objects. You update those with your Budgie ID and then it will take care of the rest.
We've got “Import FNB”, which is done via CSV files, which I’ve already implemented and then you've got “Import Investec”, which imports […] directly from your Investec account. Now again that's importing to your local machine and then those go into holding files. These are the transactions, and you can then modify these before you upload to Budgie. For Investec, it's actually a lot easier to manage because we hold everything up until the last time that it's been updated. I’m going into details now that are irrelevant. Don't worry about it.
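To make the CSV import step concrete, here is a minimal sketch of reading an export into a local holding file. The column names (`Date`, `Amount`, `Description`), the function names and the JSON shape of the holding file are assumptions for illustration; the real FNB export format and Budgie's import script are not shown in the demo.

```javascript
// Parse a simple comma-separated export into transaction objects.
// Assumes a header row of Date,Amount,Description and no quoted commas;
// a real bank export would likely need a proper CSV parser.
function parseTransactionsCsv(csvText, accountId) {
  const [header, ...rows] = csvText.trim().split(/\r?\n/);
  const columns = header.split(',').map((c) => c.trim().toLowerCase());
  return rows
    .filter((row) => row.trim() !== '')
    .map((row) => {
      const cells = row.split(',').map((c) => c.trim());
      const record = Object.fromEntries(columns.map((col, i) => [col, cells[i]]));
      return {
        accountId, // the Budgie account UUID, not the bank account number
        date: record.date,
        amount: Number(record.amount),
        description: record.description,
        masked: true, // masked by default, editable before upload
      };
    });
}

// The holding file is modelled here as pretty-printed JSON, so it can be
// reviewed and edited locally before anything is uploaded.
function toHoldingFile(transactions) {
  return JSON.stringify(transactions, null, 2);
}
```

Keeping this step local means descriptions can be checked or changed before anything leaves the machine.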
Adam: (10:01)
We have an opportunity here before they're uploaded into Budgie, and then once they're uploaded, I literally type in node upload.js and it does the magic and uploads them. We then have access to the accounts here. Now as you can see, I’ve made three of these accounts public. Because I’ve made these three accounts public, they will show up with their account numbers […] [obscured]. So, these are just unique IDs that are per query, so I will never be able to figure out […] if I’m looking at old transactions or new transactions, I will never be able to figure out whose transactions belong to who. But […] as far as getting a public transaction pool is concerned, I will get a sense of what an account did for a period of time. So, obviously I can take this back, from January, for example, and then I can take out a single account from what's downloaded and look at all of those details. Now as you can see these are all masked. That's not generally very useful, but I do have the ability here to give it masking text, so this is Swimming Medley or the Unknown […], but let's just call this “swimming stuff” and I will update that, and that is the masked text, so the actual description is still masked. I don't have to mask things; so, let me take these two for example and unmask them. Here we go. […] These two will not be masked; this one is still masked but it's masked with my own optional text and if I go back to public transactions here, we can see “swimming stuff” […] Okay, so this is now unmasked, and you can mask and unmask as you like.
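The masking behaviour and the per-query account IDs shown in this part of the demo could be sketched like this. The field names (`masked`, `maskText`), the `[masked]` placeholder and the use of random UUIDs as aliases are assumptions for illustration, not the platform's actual data model.

```javascript
const crypto = require('node:crypto');

// Build the public view of one transaction (field names are hypothetical).
function toPublicView(tx) {
  let description;
  if (!tx.masked) {
    description = tx.description; // explicitly unmasked by the owner
  } else if (tx.maskText) {
    description = tx.maskText; // masked, but with the owner's own text
  } else {
    description = '[masked]'; // masked with no replacement text
  }
  return { date: tx.date, amount: tx.amount, description };
}

// Replace real account IDs with random aliases that exist for one query only,
// so results from two separate queries cannot be joined on the account ID.
function publicQuery(transactions) {
  const aliases = new Map();
  return transactions.map((tx) => {
    if (!aliases.has(tx.accountId)) {
      aliases.set(tx.accountId, crypto.randomUUID());
    }
    return { account: aliases.get(tx.accountId), ...toPublicView(tx) };
  });
}
```

Within a single result set the alias still groups one account's transactions together, which is what gives a sense of what an account did over a period of time.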
The other aspect which we find useful is the labelling. If I label this […] I can give it a new label that says “sports”. This will contain everything that's been labelled that I’ve seen and so I can now label by sports. Let me go back to what's unlabelled, and let's go here and label by health. […] Let me just label these three things with an arbitrary label, just for the sake of the example. And now this is where, for my purposes, the power comes in.
Adam: (13:30)
Here I have a running total of all the transactions that are displayed. This could be for one month, this could be for five, and what I can do is, I can select it by labels. […] The idea was that you could select both sports and the arbitrary label and do a union of the two or an intersection of the two. […] And so that gives you a really good way of breaking down your spend and figuring out where things are going and labelling them in multiple categories etc. Right, so that kind of covers an overview of what there is and what it looks like […].
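The label selection described here, a union or an intersection of labels with a running total over the result, can be sketched as below. The function names and the `labels` array on each transaction are assumptions for illustration.

```javascript
// Select transactions by label. Mode 'any' is a union (at least one of the
// requested labels is present); mode 'all' is an intersection (every
// requested label is present on the transaction).
function filterByLabels(transactions, labels, mode = 'any') {
  const wanted = new Set(labels);
  return transactions.filter((tx) => {
    if (mode === 'all') {
      return [...wanted].every((label) => tx.labels.includes(label));
    }
    return tx.labels.some((label) => wanted.has(label));
  });
}

// Running total of whatever set of transactions is currently displayed.
function runningTotal(transactions) {
  return transactions.reduce((sum, tx) => sum + tx.amount, 0);
}
```

Because a transaction can carry several labels at once, the same spend can show up under both a sports query and a health query without being duplicated in the data.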
Adam: (14:46)
One of the challenges we encountered was learning a new UI framework. That was Huggs working with … Huggs, what was it called? Aurelia?
Hagashen Naidu (Huggs): (14:57)
Aurelia. That’s the one.
Adam: (14:58)
Right, so that was one of the challenges, and working with TypeScript was another. A big challenge for us was designing a serverless solution without storing any potentially sensitive information. That took a lot of thought and fiddling around before we felt like we got that right. Then one of our team members, Imraan [Parker], went down due to illness. So, we were a member short for the crunch time for the hackathon, so that was unfortunate.
Adam: (15:39)
In terms of next steps. Not listed here is fixing that labelling bug. We would like to improve the UI; we would like to add a nice dashboard; and add filtering and grouping functions and comparison graphs. In terms of the amount of time that we've invested in this, we really do want you guys or anybody to get involved in two ways. The first way to get involved is the easy way, which is to either upload transactions, and start using this, or to let us know why you have concerns about using this so that we can improve those things (the best options to get in touch with us would be via Slack on the OfferZen Slack channel) – or to create issues. We are actually monitoring them. And now it's time for questions.
Nick Benson: (16:40)
Thank you very much, Adam. Anyone have anything to add or ask? … Adam, well, did you guys have fun at least on this? Clearly there were a lot of challenges and things that you were doing, but did you guys at least have fun supporting it?
Huggs: (17:01)
Yeah, I think it was quite a bit of fun. I think the major difficulty was around, like Adam said, not knowing what was happening with the third member. We spent a couple of days trying to figure out, like, do we do something? Don't we do something? Is this person around? And it was only after quite a while that we actually heard from him. [Inaudible] a man down. At which point it got really fun for Adam and me. We had to really double down, and some of the ways we had split the work was suddenly, “that's not going to work; all right, let's go”. So yeah, there were some fun times, lots of Slack messages, late nights, early mornings, as a hackathon should be, I think.
Adam: (17:47)
Yeah, I learnt a lot in terms of how we structured this project. Everything that I’ve learnt about CDK and working with AWS has come through projects that I’ve done as part of this beta. So, it was very exciting. In terms of getting usage out of it, […] the public transaction pool is an ask from the community, but you know it's not really very effective if nobody's using it. At the same time, for me personally, both with my Investec account and with my FNB accounts, I found it to be a very useful tool. I needed this because there wasn't a tool available that I could trust. […] I lived in Canada for a while and so I’m used to an Intuit tool, which kind of gives you all of these tools to manage your budgets and see where your spend is going and connect all of your accounts etc., but it's very limited in what it offers, especially when it comes to breaking things down that don't have well-defined categories. If I talk about medicine, am I talking about going to a doctor? Am I talking about pharmaceuticals? Am I talking about grocery shopping? Am I talking about health shopping? Am I talking about actual groceries? Am I talking about going and grabbing an ice cream? Where's my fast food versus this versus entertainment? And trying to get a handle on a budget from all of these different angles to see which makes sense. And there aren't any platforms that allow you to do that in, as I said, in the Gmail way of looking up this label and that label but not that label. And so just personally I found an incredible amount of utility out of this. So, this was fantastic. The only thing that I’m concerned about is that I think that there were a lot of concerns, especially I remember Renen raising some concerns about the security of this. It's not fun to work on a project like this alone, in the sense of if there are concerns, I'd like to hear them and talk about them and work through them and it would be nice to get some reviews from the community.
Just because I think it's secure doesn't mean I haven't missed something. I’m very confident in its security, but not 100 percent confident and nobody can ever be, so the more eyes we get on this and the more usage we get out of this, I think we could turn this into a really powerful platform. There was a South African company that was plugging into other people's bank accounts, but there was a trust issue in that they were holding my credentials to my bank, and I wanted a way to do this where there's none of that. I don't hold anybody's credentials. Even in this case, my credentials are all completely local, which I think is a … it could be a game changer […] in an environment like this.
Nick: (20:58)
There's a question in the chat from John who's just asking [about anonymising] the transactions and he kind of foresees that it opens up for fake transactions. Are there ways around it? Have you guys thought about that?
Adam: (21:15)
So, one of the problems [is] anybody could just dump noise. The reality is, I mean there might be ways to prevent that from happening. I haven't really figured any out; I don't have any off the top of my head. But this is a collaborative project. Anybody can spin up a stack, all you need is AWS credentials, and you can spin up a stack in five minutes using the CDK project. We should consider ways of doing this, but I also think at the same time what motivation could you have to upload [fake] transactions. I can't see anybody getting real benefit out of doing that. So, if we're using it and we suddenly notice that there's noise, then that's something that should be addressed, absolutely.
Nick: (22:08)
But it's not something that you've thought about to build into a mechanism that can pick up the kind of fake transactions, because you can't tell because it's anonymised.
Adam: (22:18)
Exactly. I mean this is absolutely a valid concern and a risk in terms of how we resolve this. I’m not sure what the right way to go is because we've kind of built in anonymity from the ground up. If you upload fake transactions on a regular basis, I don't know who you are even if I find an account [inaudible], I can't stop you from creating a new account with the same email address. You could create a million accounts; there's no way for me to know who you are.
Huggs: (22:51)
Yes, I think that would probably be somebody who's using the platform for something. […] The base idea was that we would build a platform and if that was something that was like super-important for your use case that you had to filter out or figure out if people were spamming you with fake transactions or whatever the case is, then that would probably be something that came from that side. Because like Adam said there's no way for us to actually know that, and what value are you going to derive out of spamming and putting in big data. Other than Ben, who likes to mess with work stuff.
[Laughter]
Nick: (23:39)
Yeah, if there’s going to be a problem, it's probably Ben.
Ben Blaine: (23:43)
My actual bank transactions are spam, things that I buy is just spam but …
Adam: (23:53)
You mean literally in a can?
Ben: (23:56)
Yes. [Laughter] While I was thinking about that problem, first of all, I think […] I love this solution so much because of the potential of it. I think the really hard part about solutions like this is that it creates infinite opportunities, and then you're flooded with choice, and then it's like “Argh”, the universe starts flooding into your mind and it's a bit overwhelming. But there's one specific … I thought about something like this before we started the Investec project, when we were thinking about getting into it, which was – and I think in specific use cases often for myself because I struggle with big picture stuff – so one use case was: I wonder how other surfers in Cape Town spend their money and how much I earn compared to them and what they spend it on and, like food, how much do they spend on food? And thinking of it in a surfer’s segment; and then going “okay what [about] other people who work in tech; how does my spending compare to that?” and getting a rating that kind of tells me you're actually doing not too bad in your spending or you're overspending or you have too much money or you're not saving enough compared to people in your segment, which I think is quite interesting.
Ben: (25:16)
And then just looping back to this thing of spam transactions. I love the idea of just having all the transactions in one place – Boom! – and you can use AI or whatever to figure out if there's a malicious transaction [or] fake transaction in there or something. You can maybe otherwise use some kind of verified transaction thing where you do a second factor of check, like when you use your card, and it gives a reference on your bank account and then you give that reference back to PayPal […] or like a verification. […] I mean if this becomes a thing, you could do that, verify it with the source or something if it's true. There are a few ways around that if it becomes a problem or if it becomes something worth solving.
Then there's another one which I thought of which kind of ties into those surfers in Cape Town or something, is you could have transaction pools. I could create the Surfers in Cape Town transaction pool and if your solution creates a nice little landing page for that pool and it guides someone into submitting their transactions and then I can share the password to that pool with specific people, so you only get trusted access to the pool and […] I can share it in a closed Facebook group or something. I’m totally seeing a model here where you can kind of create transaction pools, but obviously what I’m thinking now is we really need to figure out an interesting first use case of this or an interesting first experiment to get transactions into it, get it pulled together. What question can we answer or something as a community? I think there's also interesting stuff where multiple businesses maybe want to share their transactions to do some kind of analytics. That's an interesting use case as well. Maybe there's some reason businesses want to do correlated transactions or something. Yeah, I don't know if anyone's got any thoughts on that or if anyone found this interesting anyone else in the audience maybe…
Nick: (27:24)
I think Devina just raised her hand.
Devina Maharaj: (27:32)
Hi guys, it's Devina. I’m from Investec Bank. I actually look after programmable banking. This was really, really great. So, well done to the team that was presenting. Definitely keen to chat to you guys afterwards. I think there's some interesting use cases here. Just as some context, […] a while back we actually saw a tool, also at one of the conferences in the UK, called People Like Me. It was actually very similar to what Ben was talking about, [it] kind of gave you the ability to compare yourself, like the surfer example, to other people like you and you could basically put in whatever criteria you wanted to. You could say “I’m a 20-year-old tech person or female” or whatever level of demographic. My one question to you guys – and I don't know if I’m just tired today and I missed it in the presentation – so apologies [for] this dummy question, but with the masking and the kind of masking of data, how do you then do the analysis from a trend perspective? Because part of the thing that you guys demonstrated was when you upload your transactions there's the [obscuring] and masking. How would you then analyse it from that perspective? Because I’m thinking, say I’m a business or a bank for instance and we wanted to create a tool like People Like Me, and just actually uploaded all of our clients’ transactions on here – we would never do that, just for the record – but I’m just saying if we wanted to do that in an anonymised way, you'd still need some level of ID and that sort of stuff, [like] a merchant category code. I was just chatting to Wayne while you were demonstrating, but how would you solve that?
Adam: (29:42)
This is very interesting stuff. Firstly, just speaking about demographics etc., that was something which hasn't been implemented but it was designed in from the get-go. So, anybody could upload their demographic data and we could then use that in order to query data; so that's already built in. In terms of the masking, the reason everything is automatically masked right now is because that's how I wrote the script, because we were treating security as a higher priority. That's up to me to determine. […] There's a flag of default masking or default not-masking. The intention is that for everything that I could reasonably expose without giving away my personal information, I would do so, but that is something that we would need probably some level of AI to kind of detect if it's a row that's just generic words or if it might have an account ID in it. So, for example, a row with Discovery information usually has some kind of account identifiers in it. That's one aspect. The second aspect – and this is something that we didn't implement but that we talked about a lot […] while coming up with this project – is that this is now built in a way where it's not a big leap to build in automatic filtering, so that we could read the transaction rows and determine how to label these transaction rows automatically. In fact, we were specifically talking about having this being a community-sourced thing where you would identify your transactions and say “Okay this is a pharmacy; this is health; this is sport; this is entertainment; this is education”, whatever, and then we would teach Budgie how to apply filters in a way that when anybody uploads, it would automatically be filtered correctly so that you could use those filters then as part of your query. So, you would query for certain demographics: “I want people between 20 and 25 and living in Cape Town, preferably in these industries if they've indicated as such and then I want to know what they’re spending on. Groceries? Or I want to know, are they spending on something else?” So, that was one way of thinking that we had about this in terms of merchant codes and category codes and that kind of thing. Budgie was initially built with Investec in mind, but it was already built with the intention of being able to collect data from any bank and not necessarily just traditional banks. So, we would like to be able to manage that in a way that it wouldn't matter whether you were downloading and uploading directly from your Investec account or getting it from an FNB CSV like I’m doing or whatever else … OCR on PDFs. But that we would have some way of automatically detecting categories and doing some kind of auto-labelling that wouldn't impose itself on your personal labels, [and] would let you do them together but enable doing the kind of queries that you guys are talking about.
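The auto-labelling idea, which Adam says was discussed but not implemented, might look something like the sketch below. Everything here is hypothetical: the rule list, the keyword patterns and the separate `autoLabels` field are assumptions chosen to show how automatic labels could sit alongside personal labels without overwriting them.

```javascript
// Hypothetical community-sourced rules: a keyword pattern and the label it suggests.
const communityRules = [
  { pattern: /pharmacy|clinic/i, label: 'health' },
  { pattern: /swim|gym/i, label: 'sport' },
  { pattern: /cinema|theatre/i, label: 'entertainment' },
];

// Attach automatic labels in their own field so personal labels stay untouched.
function autoLabel(tx, rules = communityRules) {
  const suggested = rules
    .filter((rule) => rule.pattern.test(tx.description))
    .map((rule) => rule.label);
  return { ...tx, labels: tx.labels ?? [], autoLabels: [...new Set(suggested)] };
}

// A query can then match on either kind of label.
function hasLabel(tx, label) {
  return tx.labels.includes(label) || tx.autoLabels.includes(label);
}
```

Simple keyword rules like these would only work on unmasked descriptions or on non-identifying fields such as merchant category codes, which is the trade-off raised in the question.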
Huggs: (33:26)
Just remember that it's mainly the descriptions that we're masking, so if it's a transaction label and the labels are there, we can still do all the filtering and the querying. […] To Adam's point, whoever is doing the transaction will put whatever description they want. If they decide to stick your account number in there, we can't really do it. So, security in mind, mask it first and then allow the person to change it. There's also obviously the opportunity for them to change it before uploading it. If, like Adam says, you want everything unmasked by default and you've got some fancy way of fixing your labels or fixing your description so that they all don't have any sensitive data, then that's an option. But from a filtering, querying, grouping, slicing and dicing [point of view], we could do that with labels, if that were put in and again, merchants’ codes or any other non-identifying information. The idea was to keep the identifying information as secure and as blocked as possible and very explicitly saying it's a very explicit decision by the user to say I want this information shown. We could obviously put in warnings and stuff like that in fancy UIs, but it was an explicit decision by whoever was using the platform to say I’m going to allow this to be seen.
Adam: (35:00)
Two things to that: the first is, I don't [know] if you guys noticed how quick it is to mask and unmask. That was deliberate, so that you could literally go down the line unmasking things that are irrelevant and adding new mask descriptions to anything that you don't want masked. The second thing is, I’m sure you guys noticed that we are not UI guys, that the UI that we have is not absolutely stunning and MVP ready. But the idea is that we have built a back end that can support all of these things. Whether it's you know a solution to […] to feed things into the Budgie, a solution to interact with Budgie, those are things that we absolutely welcome other people jumping in and saying, “Hey I’ve got a better way of doing this”. Go ahead.
The idea was just to be able to collect this information and present it in a way that we can actually have this conversation in the first place.
Get involved in the Programmable Banking Community
If you have questions or just want to say hi to the Programmable Banking Community leaders, you can pop us a mail and we will get back to you.
If you want to see more from what the community has been up to, you can: