Our fifth un-hackathon kicked off the year with 30 eager data scientists attending. The event, held at Makerhive in Kennedy Town, combined industry talks with project-based collaboration.
As the mercury dipped outside, the fires of creativity burned bright among our attendees. Five project leaders pitched projects to focus the skills of our data scientists, including one uncovering new insights into the Hong Kong property market from a 1.6-million-row record of transactions over the past 20 years. Another project aimed to discover whether public data can spot a scam initial coin offering, or ICO.
Pranav Agrawal, a Hong Kong University of Science and Technology student, presented a code tutorial on multi-layer perceptrons in PyTorch. The in-depth, code-centric tutorial took us step by step through the process. We can share links to the documentation here:
Hang Xu presented his method of analysing DNA with the word2vec (word-to-vector) model. He said that adapting word2vec to DNA outperformed the best results from the current approach, which represents DNA with one-hot vectors.
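To make the contrast between the two representations concrete, here is a minimal sketch in plain Python. It shows a one-hot encoding of a DNA string next to the kind of "words" a word2vec-style model could be trained on. Splitting the sequence into overlapping k-mers is our assumption for illustration, as are the function names; the talk's actual tokenisation may well differ.

```python
from typing import List

NUCLEOTIDES = "ACGT"


def one_hot_encode(sequence: str) -> List[List[int]]:
    """Encode each base as a 4-element indicator vector in A, C, G, T order."""
    vectors = []
    for base in sequence.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected base: {base!r}")
        vectors.append([1 if base == n else 0 for n in NUCLEOTIDES])
    return vectors


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Split a sequence into overlapping k-mers.

    Assumption: k-mers play the role of 'words' in a word2vec-style corpus.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    dna = "ACGTAC"  # toy sequence, not real data
    print(one_hot_encode(dna)[:2])
    print(kmer_tokens(dna))
```

A one-hot vector says only which base sits at a position, whereas an embedding learned over tokens like these can place sequence fragments that appear in similar contexts close together.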
- Guy’s property data analysis
- Jenson’s ICO scam detector
- Kirill’s Ansible machine learning speed booster
Property data analysis
Using a 1.2 GB table of 1.6 million property transactions in Hong Kong, from 1997 to today, this group looked for trends and insights in the property market. One of the central questions was how fast property prices have been growing relative to wages in the city.
They found some bargains, even in the current market. See their presentation with their findings.
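One simple way to compare price growth with wage growth is to put both on a compound annual growth rate. The sketch below is ours, not the group's code, and the figures in the usage example are made-up placeholders rather than numbers from the dataset.

```python
def compound_annual_growth_rate(start_value: float, end_value: float, years: float) -> float:
    """Return the constant yearly growth rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def growth_gap(price_start: float, price_end: float,
               wage_start: float, wage_end: float, years: float) -> float:
    """Yearly price growth minus yearly wage growth over the same period."""
    price_growth = compound_annual_growth_rate(price_start, price_end, years)
    wage_growth = compound_annual_growth_rate(wage_start, wage_end, years)
    return price_growth - wage_growth


if __name__ == "__main__":
    # Hypothetical index values, for illustration only.
    gap = growth_gap(price_start=100.0, price_end=250.0,
                     wage_start=100.0, wage_end=160.0, years=20)
    print(f"Prices outpaced wages by {gap:.2%} per year")
```

A positive gap means prices compounded faster than wages over the period, which is the pattern the central question is probing.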
Ansible speed boosting for NumPy and R
A lot of machine learning tools depend on matrix manipulation libraries such as NumPy. In a basic configuration, NumPy runs linear algebra computations such as matrix multiplication, SVD or eigenvalue decomposition on the CPU. OpenBLAS speeds up these computations 4-10x via Fortran bindings.
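To see what an optimised linear algebra backend buys you on your own machine, you can time a matrix multiplication. The sketch below is an illustration, not the group's benchmark. It times a plain-Python triple loop and, if NumPy happens to be installed, the same product through NumPy, then prints which BLAS library NumPy was built against. The matrix size and helper names are arbitrary choices of ours, and the speed-up you measure here is against pure Python, not the 4-10x figure quoted above.

```python
import time


def naive_matmul(a, b):
    """Multiply two matrices given as lists of rows, using plain Python loops."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):
            a_ik = a[i][k]
            for j in range(cols):
                result[i][j] += a_ik * b[k][j]
    return result


def time_call(fn, *args):
    """Run fn(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - start


if __name__ == "__main__":
    n = 120  # arbitrary size, small enough for the pure-Python loop
    a = [[((i + j) % 7) / 7.0 for j in range(n)] for i in range(n)]
    b = [[((i * j) % 5) / 5.0 for j in range(n)] for i in range(n)]

    _, naive_seconds = time_call(naive_matmul, a, b)
    print(f"pure Python: {naive_seconds:.4f} s")

    try:
        import numpy as np
    except ImportError:
        print("NumPy is not installed; skipping the comparison")
    else:
        np.show_config()  # lists the BLAS/LAPACK libraries NumPy was built with
        _, numpy_seconds = time_call(np.matmul, np.array(a), np.array(b))
        print(f"NumPy: {numpy_seconds:.4f} s")
```

Running this before and after switching the BLAS backend gives a rough feel for the difference, though a single timing is noisy and a proper benchmark would repeat each measurement.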
Is this ICO a scam?
The group pulled a list of more than 1,600 ICOs from the past two years and evaluated their value, asking whether public data could establish if a given ICO is a scam. The second step was to gather the return on investment for each ICO and the country it was reported to have come from.
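The second step could be sketched roughly as follows. This is an assumption about the shape of the data, not the group's pipeline: the field names `country`, `ico_price` and `current_price` are hypothetical, and the records are invented. A poor return is also not proof of a scam on its own; it is only one signal to examine.

```python
from collections import defaultdict
from statistics import median


def return_on_investment(ico_price: float, current_price: float) -> float:
    """Fractional gain or loss relative to the ICO price (0.5 means +50%)."""
    if ico_price <= 0:
        raise ValueError("ico_price must be positive")
    return (current_price - ico_price) / ico_price


def median_roi_by_country(records):
    """Group records by reported country and return the median ROI for each."""
    grouped = defaultdict(list)
    for record in records:
        roi = return_on_investment(record["ico_price"], record["current_price"])
        grouped[record["country"]].append(roi)
    return {country: median(values) for country, values in grouped.items()}


if __name__ == "__main__":
    # Invented sample records with hypothetical field names.
    sample = [
        {"country": "X", "ico_price": 1.0, "current_price": 3.0},
        {"country": "X", "ico_price": 2.0, "current_price": 1.0},
        {"country": "Y", "ico_price": 4.0, "current_price": 4.0},
    ]
    print(median_roi_by_country(sample))
```

The median is used rather than the mean because a handful of extreme winners can otherwise dominate a country's figure.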
Morris Wong worked on scraping a dataset to build a structured system that helps jobseekers vet a company before joining. Using user-generated data from stealjobs.com, he aims to build an explorer in the shape of GitXplore around four metrics: income, working hours, promotion prospects and happiness.
See you all at our next event in March.