This is a guest post by Austeja Petroskeviciute, chair of the recently launched Bath Machine Learning Meetup. Austeja has provided an introduction to the group and some background on their parking data project.
It’s been four months since the Bath Machine Learning Meetup (BMLM) was launched and we’ve had many great meetups so far. We’ve had talks on introductory Machine Learning, CRISP-DM methodology, Kaggle competitions, and Recommender Systems. The best thing about these meetups is that any small question or opinion raised opens up discussions in the room and it’s a great opportunity to find out something new.
The main purpose of BMLM is to create these opportunities to share experiences and learn from each other. To that end, we have also started a Machine Learning project, “Predicting Parking Spaces in Bath”, to practise working with large data sets and applying Machine Learning algorithms.
We have two teams, one working in R and one in Python, and our main data source is the BANES Historic Car Park Occupancy data provided by Bath: Hacked. Our aim is to apply Machine Learning algorithms to parking records from 8 car parks in order to predict parking occupancy in Bath. Because the data is large, with 1.6 million records, we picked out 7 features to use: the name of the car park, date of last update, date uploaded, occupancy, capacity, percentage, and status.
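For readers who want to picture a single record, the seven features listed above could be modelled roughly as follows. This is only a sketch: the field names, the example values, and the assumption that the percentage is simply occupancy divided by capacity are ours for illustration, not taken from the published dataset.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ParkingRecord:
    # Field names are illustrative; the real dataset may name them differently.
    name: str
    last_update: datetime
    date_uploaded: datetime
    occupancy: Optional[int]
    capacity: Optional[int]
    percentage: Optional[float]
    status: str


def percent_full(record: ParkingRecord) -> Optional[float]:
    """Recompute the percentage from occupancy and capacity.

    Assumes percentage = 100 * occupancy / capacity, which is a guess
    about how the published column is derived.
    """
    if record.occupancy is None or not record.capacity:
        return None  # cannot compute without both numbers
    return 100.0 * record.occupancy / record.capacity


# A made-up record, purely for illustration.
example = ParkingRecord(
    name="Example Car Park",
    last_update=datetime(2017, 1, 1, 8, 0),
    date_uploaded=datetime(2017, 1, 1, 8, 5),
    occupancy=150,
    capacity=600,
    percentage=25.0,
    status="Filling",
)
print(percent_full(example))
```

Comparing a recomputed value like this against the stored percentage is one cheap way to spot records that look inconsistent.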
Currently we are at the stage of investigating the data to find any general trends or missing numbers. The claim that data preparation accounts for about 80% of a data scientist’s work turns out not to be a myth. We have to think about what to do with duplicated or missing data, and we have learned that the best option is to plot it and see for yourself. As featured in Bath: Hacked’s previous blog post, Owen Jones from team R has beautifully summarised his findings here.
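Before plotting, a quick count of duplicated and missing entries can show how big the problem is. The sketch below uses only the Python standard library on a tiny made-up sample; the column names, the sample rows, and the choice to treat two rows with the same car park and update time as duplicates are all assumptions for illustration, and the helper `summarise_quality` is hypothetical rather than part of either team's code.

```python
import csv
import io
from collections import Counter

# Made-up sample in a CSV layout we assume for illustration only.
SAMPLE_CSV = """name,last_update,occupancy,capacity
Car Park A,2017-01-01 08:00,120,500
Car Park A,2017-01-01 08:00,120,500
Car Park B,2017-01-01 08:00,,300
Car Park B,2017-01-01 08:05,95,300
"""


def summarise_quality(rows, key_fields=("name", "last_update")):
    """Count repeated keys and empty values per column."""
    seen = Counter()
    missing = Counter()
    for row in rows:
        # Rows sharing the same key are treated as duplicates (an assumption).
        seen[tuple(row[f] for f in key_fields)] += 1
        for field, value in row.items():
            if value is None or value.strip() == "":
                missing[field] += 1
    # Every occurrence of a key beyond the first counts as one duplicate.
    duplicates = sum(count - 1 for count in seen.values() if count > 1)
    return duplicates, dict(missing)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
duplicates, missing = summarise_quality(rows)
print(duplicates, missing)
```

Counts like these do not replace a plot, but they help decide which columns and car parks are worth plotting first.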
We have a wide range of people in our team including students studying mathematics or computer science, software engineers, and even professional data scientists. For some it is a chance to brush up coding skills, while for others it is a chance to put theory into practice. There is always somebody available to answer questions, and everybody is willing to help.
Since this project is purely for learning, we haven’t put too much emphasis on accuracy at the moment. However, as we progress to the stage of actually applying machine learning algorithms, we believe we can experiment with various techniques to make better predictions.
We warmly welcome anybody to join our team at any point – you can drop in once or dig into the project with us. We would also appreciate help from experts in navigating our project! For more meetups from BMLM, please see our meetup page.
We’re looking forward to seeing the results of the work and hope to be able to publish the cleaned-up data for others to use. If you’re working on a local project using data from our store (or elsewhere), then please get in touch. We’d love to share your work!