Unhackathon #5: Discovering trends in property data and scams in ICOs

Our fifth un-hackathon kicked off the year with 30 eager data scientists attending. The event, held at Makerhive in Kennedy Town, combined industry talks with project-based collaboration.

As the mercury dipped outside, the fires of creativity burned bright among our attendees. Five project leaders suggested projects. One focused the skills of our data scientists on uncovering new insights into the Hong Kong property market, using a 1.6 million row record of transactions over the past 20 years. Another aimed to discover whether public data can spot a scam initial coin offering, or ICO.

Presentations


Pranav Agrawal, a student at the Hong Kong University of Science and Technology, presented a code tutorial on multi-layer perceptrons in PyTorch. The in-depth, code-centric tutorial took us through the process step by step. We can share links to the documentation here:

GitHub

Presentation


Hang Xu presented his method of analysing DNA with the word2vec (word-to-vector) model. He said that adapting word2vec to DNA outperformed the best results from the current approach, which represents DNA as one-hot vectors.

Presentation
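
To make the contrast between the two representations concrete, here is a minimal sketch in plain Python. It is not taken from Hang Xu's presentation: the helper names are hypothetical, and splitting a sequence into overlapping k-mers is only an assumed way of producing the "words" that a word2vec-style model would be trained on.

```python
from typing import Dict, List

BASES = "ACGT"


def one_hot_encode(sequence: str) -> List[List[int]]:
    """Encode each base as a 4-element indicator vector in A, C, G, T order."""
    index: Dict[str, int] = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in sequence.upper():
        vector = [0] * len(BASES)
        vector[index[base]] = 1
        encoded.append(vector)
    return encoded


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Split a sequence into overlapping k-mers (assumed 'words' for word2vec)."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    dna = "ACGTAC"  # made-up example sequence
    print(one_hot_encode(dna)[:2])
    print(kmer_tokens(dna))
```

A one-hot vector treats every base as equally unrelated to every other, whereas a list of k-mer tokens could be passed to any word2vec implementation to learn dense vectors from the contexts in which each k-mer appears.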

Projects

  • Guy’s property data analysis
  • Jenson’s ICO scam detector
  • Kirill’s Ansible machine learning speed booster

Property data analysis

Using a 1.2 GB table of 1.6 million property transactions in Hong Kong, from 1997 to today, this group looked for trends and insights in the property market. One of the central questions was how fast property prices have been growing relative to wages in the city.

They found some bargains, even in the current market. See their presentation with their findings.
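
One way to put a number on the price versus wage question is to compare compound annual growth rates. The sketch below is an illustration only, not the group's method: the function names are hypothetical and the figures in the usage example are made up rather than taken from the dataset.

```python
def annual_growth_rate(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def growth_gap(price_start: float, price_end: float,
               wage_start: float, wage_end: float, years: float) -> float:
    """Difference between annual price growth and annual wage growth."""
    price_growth = annual_growth_rate(price_start, price_end, years)
    wage_growth = annual_growth_rate(wage_start, wage_end, years)
    return price_growth - wage_growth


if __name__ == "__main__":
    # Hypothetical figures for illustration, not real Hong Kong data.
    gap = growth_gap(price_start=100.0, price_end=250.0,
                     wage_start=100.0, wage_end=150.0, years=20)
    print(f"prices outpaced wages by {gap:.2%} per year")
```

A positive gap means prices grew faster than wages over the period, and a negative gap means the opposite.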

Ansible speed boosting for NumPy and R

A lot of machine learning tools depend on matrix manipulation libraries such as NumPy. In a basic configuration, NumPy runs linear algebra computations such as matrix multiplication, SVD or eigenvalue decomposition on the CPU. According to the project, OpenBLAS speeds these computations up 4-10x through its Fortran bindings.

GitHub

See their presentation here.
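
To check which linear algebra backend a NumPy installation uses, and to get a rough timing to compare before and after switching to OpenBLAS, a small benchmark along these lines may help. This is a sketch, not the project's own benchmark: `time_matmul` is a hypothetical helper, and the matrix size and repeat count are arbitrary choices.

```python
import time

import numpy as np


def time_matmul(n: int = 1000, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds for an n x n matrix product."""
    rng = np.random.default_rng(0)
    a = rng.random((n, n))
    b = rng.random((n, n))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a @ b  # dispatched to whichever BLAS library NumPy was built against
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    np.show_config()  # prints the BLAS/LAPACK libraries this NumPy build uses
    print(f"best of 3 runs: {time_matmul():.3f} s")
```

Running the same script on two differently configured machines gives a like-for-like comparison. The speedup will depend on the hardware and the matrix size.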

Is this ICO a scam?

The group pulled a list of over 1,600 ICOs from the past two years and evaluated their value, asking whether the data could establish that an ICO is a scam. The second step was to gather the return on investment for each ICO and the country it was reported to have come from.

See their presentation and findings here.
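
The return on investment step can be sketched as below. This is a plain-Python illustration under stated assumptions, not the group's code: the record layout (country, ICO price, current price) and the function names are hypothetical, grouping by country with a median is an assumed way of summarising the results, and the sample records are made up.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple


def return_on_investment(ico_price: float, current_price: float) -> float:
    """ROI as a fraction of the ICO price: 0.5 means +50%, -1.0 means a total loss."""
    if ico_price <= 0:
        raise ValueError("ico_price must be positive")
    return (current_price - ico_price) / ico_price


def median_roi_by_country(
    records: Iterable[Tuple[str, float, float]]
) -> Dict[str, float]:
    """Median ROI per reported country from (country, ico_price, current_price) rows."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for country, ico_price, current_price in records:
        grouped[country].append(return_on_investment(ico_price, current_price))
    return {country: median(values) for country, values in grouped.items()}


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample = [("A", 1.0, 2.0), ("A", 1.0, 0.0), ("B", 2.0, 3.0)]
    print(median_roi_by_country(sample))
```

A median is used here because a handful of extreme winners or total losses would dominate a mean.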

Job explorer

Morris Wong worked on scraping a dataset to build a structured system that helps jobseekers vet a company before joining. Using user-generated data from stealjobs.com, he aims to build an explorer modelled on GitXplore around four metrics: income, working hours, promotion prospects and happiness.

See you all at our next event in March.