How Prostate Cancer UK built its own message bus
The cancer charity received funding ring-fenced for modernising its data management – the funds were spent on a solutions architect and an ETL process
Prostate Cancer UK is using the robotic process automation (RPA) functionality in its existing Toca enterprise developer platform to push data into a processing pipeline for updating its customer relationship management (CRM).
Three years ago, struggling with data processing, the charity hired Gerardo Del Guercio, who previously worked at the Home Office, to fix the problem. His role – solutions architect – was funded by an equity firm, which offered two years of fixed funding. This funding has enabled Prostate Cancer UK to update its end-to-end data management process.
One of the tech challenges it faced was the fact that data management was becoming a bottleneck. As Del Guercio points out: “If you don’t engage with your fundraisers and your donors, then you’re kind of losing the funding race. So data is really important.
“The reason you choose a charity is because it’s close to your heart. You go onto the website and register your interest, fill in a form, and it all looks great, but then suddenly you don’t hear anything back from the charity. What’s happening once all that data has been entered?”
Prostate Cancer UK’s data latency bottleneck restricted how much data could be imported into its legacy CRM system per day. The charity previously spent around 2,200 hours a year manually collecting and processing donation reports and data from different feeds, such as JustGiving and Facebook, transforming the data and importing it into its CRM system.
“We have a team of eight data executives and their job, day in, day out, was just receiving data from third-party organisations or processing online forms,” says Del Guercio. Every day, this team dedicated their time to importing data, which, he says, led to “quite low morale”.
Building a data funnel
The ring-fenced funding meant Del Guercio could look at how to improve the end-to-end data management process. He decided that the right approach would be for Prostate Cancer UK to develop its own extract, transform and load (ETL) pipeline based on the concept of a message broker.
The idea was based on his experience of how the Home Office managed data. Such technology blueprints are normally associated with central government and global organisations such as airlines and financial institutions. Certainly, for Del Guercio, an off-the-shelf message broker would be well beyond his budget. “We haven’t got that kind of money here, so we have to come up with our own bespoke version of a message broker,” he says.
In organisations that simply cannot afford the salaries needed to attract the top tech talent, Gerardo Del Guercio, solutions architect at Prostate Cancer UK, urges IT leaders to think about other motivations that keep people on board.
“If you can't keep people with money, then you've got to have other ways to make their job interesting. Learning something new, to some extent, is like giving them adrenaline. ‘If I stay here for another year, I’m going to be this person and then I could go outside’,” he says.
For instance, younger people who join the charity sector are generally at the beginning of their careers. Prostate Cancer UK is in the third year of a work placement programme where a student at Cardiff University works on Del Guercio’s team and takes full responsibility for the project that extracts data from third-party sites.
Unlike solutions architects in large organisations, who are generally hired to develop a blueprint for the organisation’s IT architecture, Del Guercio says his role was to get the job done using “the right tools for the right job”.
Part of this decision involved assessing Toca’s low-code enterprise development platform, which the charity had already begun using for the whole ETL data management process. Prior to his arrival at Prostate Cancer UK, this was being managed by “a non-techie”.
According to Del Guercio, Toca’s strengths lie on the robotic process automation side.
When he came on board, Del Guercio started by breaking up the ETL process into different components. “I wanted to get away from being supplier-led to owning the intellectual property of the technology solution in-house,” he says. He decided that Toca was best placed for the extract part of the ETL.
Looking at the data transformation, Prostate Cancer UK was already a Microsoft SQL Server customer, and Del Guercio took on developing the scripts required for the transformation side of the ETL process.
The final stage is what Del Guercio calls “funnelling the data”: loading it from the SQL database through a single pipe into the legacy CRM system. “Because each data source had a different import process, we needed a one-to-one relationship to load the data into our CRM,” he says.
This funnel approach to merging the data in the transformation stage means that as the CRM reaches end of life, it can be swapped out without requiring a major reworking of the ETL process.
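The funnel idea can be illustrated with a minimal Python sketch. It is not the charity’s implementation, which the article describes as SQL Server scripts fed by Toca extracts. Every name here (`Donation`, `from_justgiving`, `from_facebook`, `CrmLoader`, `InMemoryCrm`, `funnel`) and every raw field name is hypothetical, chosen only to show many source-specific transforms converging on one load path.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol


@dataclass(frozen=True)
class Donation:
    """Common record shape every source is funnelled into (hypothetical)."""
    source: str
    donor_email: str
    amount_pence: int


# One transformer per source. The raw field names are invented for
# illustration; real feeds will differ.
def from_justgiving(row: dict) -> Donation:
    pence = round(float(row["amount_gbp"]) * 100)
    return Donation("justgiving", row["email"].strip().lower(), pence)


def from_facebook(row: dict) -> Donation:
    pence = int(row["amount_minor_units"])
    return Donation("facebook", row["donor_email"].strip().lower(), pence)


TRANSFORMERS: Dict[str, Callable[[dict], Donation]] = {
    "justgiving": from_justgiving,
    "facebook": from_facebook,
}


class CrmLoader(Protocol):
    """The single pipe into the CRM; only this part changes if the CRM is replaced."""
    def load(self, records: List[Donation]) -> int: ...


class InMemoryCrm:
    """Stand-in CRM used only to make the sketch runnable."""
    def __init__(self) -> None:
        self.rows: List[Donation] = []

    def load(self, records: List[Donation]) -> int:
        self.rows.extend(records)
        return len(records)


def funnel(batches: Dict[str, List[dict]], crm: CrmLoader) -> int:
    """Transform each source's rows to the common shape, then load once."""
    records = [
        TRANSFORMERS[source](row)
        for source, rows in batches.items()
        for row in rows
    ]
    return crm.load(records)
```

Because the transformers only ever produce `Donation` records, swapping the CRM in this sketch means writing a new class with a `load` method, while the per-source code stays untouched.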
Prostate Cancer UK currently collects data from more than 30 external agencies. Talking through how the process works in practice, Del Guercio says: “Each morning, any time between midnight and 6.00am, a Toca process will wake up and run data extracts.”
This process collects the data from each of these agencies. Between 6.00am and 7.15am, an importer process is run, which uploads the extracted data into the SQL Server database. SQL Server then runs a transformation on each agency’s data, and the results are then loaded into the CRM system. When people get to work at 9.00am, there is a two-hour window for manual checks on anything the automation may have failed to process. By 11.00am, the new data is available in the CRM system.
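The daily timetable can be written down as a small lookup table, sketched here in Python. The stage names and the `stage_at` helper are hypothetical, and the article gives no explicit times for the transformation and CRM load, so placing them between 7.15am and 9.00am is an assumption.

```python
from datetime import time

# (start, end, stage) windows; the end time is exclusive.
SCHEDULE = [
    (time(0, 0), time(6, 0), "extract"),
    (time(6, 0), time(7, 15), "import to SQL Server"),
    # Assumed window: the article does not say when this stage runs.
    (time(7, 15), time(9, 0), "transform and load to CRM"),
    (time(9, 0), time(11, 0), "manual checks"),
]


def stage_at(now: time) -> str:
    """Return the pipeline stage expected to be active at a given time of day."""
    for start, end, stage in SCHEDULE:
        if start <= now < end:
            return stage
    return "data available in CRM"
```

For example, `stage_at(time(6, 30))` falls in the importer window, while any time from 11.00am onwards returns the final state.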
According to Del Guercio, the “risk checker” initiative, which Prostate Cancer UK ran with NHS England between February and March this year, would not have been possible without the level of automation that was available through the new ETL process.
“The joint campaign was about trying to find the missing men who hadn’t gone in for prostate checks during Covid-19,” he says. The campaign, backed up by TV broadcasts, directed families to the Prostate Cancer UK online risk checker. “We had 550,000 families on the risk tracker.”
At one stage during the campaign, Del Guercio says there were 15,000 concurrent users – a volume at which he says “there’s no way we would have been able to handle the data import [before]”.
Automating and streamlining the collection of donation reports has enabled the in-house team at Prostate Cancer UK to refocus its time on more valuable tasks.
In Del Guercio’s experience, this helps to build up the morale of the team by giving people challenges and demonstrating to them that they’ve got a future in developing the technical skills to meet the challenges the organisation faces – “a future where you know they’re learning”.