Uncovering trends in text through the Data Science Accelerator

February 13, 2024 | Category: Data, Digital Scotland

Guest blog post by Michael Shaw, Scottish Legal Complaints Commission.

As a public sector data analyst, it’s frustrating when you think you’ve got a good idea to solve a business problem, but don’t have enough time or technical support to explore it fully. That was certainly the case for me when it came to trying to unlock a particular issue for my organisation.

Every year over 1,000 people make a complaint about a lawyer to the Scottish Legal Complaints Commission. While some structured data on these complaints is captured in our case management system, much of the detail is only available in Word or PDF documents attached to the record. That means valuable information and trend data are locked away in these files, with no systematic way of exploring their content.

I’d been trying to do text analysis in R to understand what was held in these files, but I could only devote a small amount of time to the problem, which meant progress was limited. However, the opportunity to apply for the Data Science Accelerator handed me the key I needed.

We developed an accelerator project proposal to find methods to extract data from these documents and, potentially, to identify categories within the information they held. The application was accepted and by September 2023 the project was underway.
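To give a flavour of what "extracting data from these documents" can involve, here is a minimal sketch. It is not the project's code: the project work was done in R and is not shown in this post. The sketch uses Python's standard library purely for illustration, handles Word (.docx) files only, and the function names and the tiny stop-word list are hypothetical. PDF files would need a separate library and are not covered.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by the main document part inside a .docx archive
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Hypothetical, deliberately tiny stop-word list for illustration only
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "my", "was", "not", "did"}


def docx_paragraphs(source):
    """Return the text of each non-empty paragraph in a .docx file.

    `source` can be a file path or a binary file-like object.
    A .docx file is a zip archive; the body text lives in word/document.xml.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> elements
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs


def term_counts(paragraphs):
    """Count lower-cased words across paragraphs, skipping stop words."""
    words = re.findall(r"[a-z']+", " ".join(paragraphs).lower())
    return Counter(word for word in words if word not in STOP_WORDS)
```

With a hypothetical file name, `term_counts(docx_paragraphs("complaint.docx")).most_common(10)` would list the ten most frequent terms in one document. This is only a starting point: it ignores tabs and line breaks, and it may repeat text from paragraphs nested inside text boxes.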

As part of the accelerator programme, you’re paired with a mentor. In my case, I got the chance to be mentored by the amazing Reema Vadoliya of People of Data. It was my first time as a mentee and Reema was an incredible mentor who brought great experience from her data engineering background. Reema and I met weekly throughout the 12-week programme (in practice, one day a week for 12 weeks). We started by planning, drawing out the steps that I needed to take and highlighting unknowns.

Pragmatism is one of the things I learned from being in the accelerator. While I had set my sights on developing a universe of text data, I realised pretty quickly that I had to rein in my ambition and set a more realistic target for the project. Working with Reema and my manager, that translated into a goal of finishing with some repeatable, commented and documented code which could be a starting point for future work, with the ultimate aim of sharing learning from complaints back to the legal profession.

The accelerator programme wrapped up in December 2023 and we are going into 2024 with a toolkit that we can use to access and analyse text data – something that was previously beyond our reach.

As well as the things I learned from the technical work itself, I gained experience of managing an innovation project and of presenting complex ideas in a way that is easy to understand. I also learned a lot from the weekly meetings with the rest of the accelerator cohort. I would strongly recommend the Data Science Accelerator experience to other public sector data analysts when applications open in spring 2024.

If Michael’s blog post has inspired you, you can now apply for the 2024 Data Science Accelerator programme on the Scottish Digital Academy’s website. We look forward to hearing your ideas!
