Unhackathon #10

Our 10th Hackathon for Data Science: a full day of fun and working together on YOUR data science projects!

At this event, attendees will have the chance to pitch their own projects or join other people’s. At the start of the day, we will host some fantastic industry specialists who will share their experiences of working in the data science field.

Signup at: Eventbrite, Meetup, Facebook

The event will be held at the South China Morning Post offices at Times Square

 

Schedule of events:

9.30am – Arrive, registration
10am – Welcome
10.15am – Talks begin
11.30am – Pitch session, recruitment
12pm – Work on projects
5.30pm – Present results of work session

Location:

SCMP: 20/f, Tower 1, Times Square, 1 Matheson St, Causeway Bay

Requirements:

Laptop / charger for those joining the coding
Prepared data and project pitches for those submitting projects
If presenting, send us your presentation slides ahead of time so we can prepare them.
50HKD in cash for the space rental

Recommendations for project submissions:

Send us your presentation slides! Send a link to one of the organisers on Slack or by another channel. We want to minimise time spent switching laptops, so we will run your slides from our PC.
Prepare data in advance as much as you can; spending the day cleaning or retrieving data won’t attract crowds of data scientists! Contact the organisers if you need a data repository to share data with all your team members.
If the project is already underway, prepare an introduction to it so that people can join. Make sure the task you propose is feasible within the time available at the event, and describe the skills you expect your team to have (R or Python? AWS? Spark?).

For final presentations:

Start writing the final presentation at the beginning of the day and add to it little by little. Explain why you want to do the project and what your solution is. Make it understandable to everyone.
If you wish, your work can be published on this website with your name, bio, etc.

Other details:

50 participants max
Food/drink: Only water, coffee and tea are provided. Attendees can order their own food to the venue, take a break to find a restaurant nearby or bring their own lunch.
Price: 50 HKD. We charge a fee to cover venue and food costs. We are a not-for-profit organisation and aim to keep the costs of our events as low as possible to make them accessible to all.

Unhackathon #9 roundup

With the World Cup in Russia wrapping up on the same evening as our ninth unhackathon, football was on our minds. With tongue in cheek, Data Science Hong Kong co-organiser Xavier asked our cohort of predictive modellers to pick the winner, hours before the result was known.

We were asked to build a predictive model for the World Cup result, but our vote worked well enough.

Getting down to more serious stuff, Houston Ho presented his company’s work on using machine learning to predict whether an employee is likely to leave their position. His tool aims to give human resources teams a score for each employee based on a variety of characteristics. His model currently achieves 80% accuracy, and he says it can be improved further.

See his presentation below.

DSHK co-organiser Guy Freeman also presented his new project, a central repository for scraped data built on an open source philosophy. He showed the system’s potential using a dataset of property transactions in Hong Kong spanning 20 years.

See his presentation below.

Group projects

Michael’s restaurant prediction project attracted the most interest among the group. Using data from two restaurant booking systems, he aimed to predict how busy a restaurant would get using a machine learning model.

See the results from the group’s work in the slides below.

Forecasting Visitors of Restaurants

Visiting from Tokyo, Suzana Ilic brought exciting skills to the unhackathon and set her sights on unpicking hype in the crypto space. She was a bit camera shy, so there is no video, but you can see her slides below.

Quantifying Hype

Morris Wong aimed to build an auto-tagging system for publishing, along the lines of taggernews. This tech could have wide-scale application when it’s up and running. See his presentation below.

Pocket exploration

That’s it for this month’s event. We will be announcing our next event shortly.

Unhackathon #8

Our 8th Hackathon for Data Science: a full day of fun and working together on YOUR data science projects!

At this event, attendees will have the chance to pitch their own projects or join other people’s. At the start of the day, we will host some fantastic industry specialists who will share their experiences of working in the data science field.

Signup at: Eventbrite, Meetup, Facebook

naked-hub

Schedule of events:

9.30am – Arrive, registration
10am – Welcome
10.15am – Talks begin
11.30am – Pitch session, recruitment
12pm – Work on projects
5.30pm – Present results of work session

Location:
16F, 40-44 Bonham Strand, Sheung Wan, Hong Kong

Requirements:

Laptop / charger for those joining the coding
Prepared data and project pitches for those submitting projects
If presenting, send us your presentation slides ahead of time so we can prepare them
50HKD in cash for the space rental

Recommendations for project submissions:

Send us your presentation slides! Send a link to one of the organisers on Slack or by another channel. We want to minimise time spent switching laptops, so we will run your slides from our PC.
Prepare data in advance as much as you can; spending the day cleaning or retrieving data won’t attract crowds of data scientists! Contact the organisers if you need a data repository to share data with all your team members.
If the project is already underway, prepare an introduction to it so that people can join. Make sure the task you propose is feasible within the time available at the event, and describe the skills you expect your team to have (R or Python? AWS? Spark?).

For final presentations:

Start writing the final presentation at the beginning of the day and add to it little by little. Explain why you want to do the project and what your solution is. Make it understandable to everyone.
If you wish, your work can be published on this website with your name, bio, etc.

Other details:

50 participants max
Food/drink: Only water, coffee and tea are provided. Attendees can order their own food to the venue, take a break to find a restaurant nearby or bring their own lunch.
Price: 50 HKD. We charge a fee to cover venue and food costs. We are a not-for-profit organisation and aim to keep the costs of our events as low as possible to make them accessible to all.

Unhackathon #7 round-up: making sense through data

What time do people rent share bikes in San Jose? Houston and a group of data scientists looked at bike share data from California and made some curious observations at our April unhackathon.

We also heard from Nick Lam-wai, who is building a database of Hong Kong’s budget, the blueprint of government spending and priorities. Chris Choy, who was working with Nick, also worked out how to take historical PDFs of the budget and read the tables into Nick’s database. Expect big things from this group.

Our second meetup at Accellerate in Sheung Wan started with a discussion of the CatBoost library by Daniil Chepenko, who explained its benefits over other methods such as random forest.

CatBoost is a library for gradient boosting on decision trees, developed by the Russian search company Yandex and building on many years of work in this field.
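For readers new to the idea, here is a toy sketch of gradient boosting on decision trees in plain Python. It is only an illustration of the general technique and is not how CatBoost itself is implemented: the trees are single-split "stumps", the loss is squared error, there is one numeric feature, and every function name and the sample data are made up for this example.

```python
# Toy gradient boosting for regression with squared error.
# Each "tree" is a stump: one threshold on a single numeric feature.
# Illustrative sketch only; function names and data are hypothetical.

def fit_stump(xs, residuals):
    """Pick the threshold that best splits the residuals into two group means.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for x, p in zip(xs, predictions)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6, 7, 8]
    ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
    model = fit_boosting(xs, ys)
    print("training MSE:", mean_squared_error(model, xs, ys))
```

A random forest averages many trees that are trained independently, whereas boosting trains each tree on the errors left by the trees before it, which is the main structural difference between the two families.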

See his presentation video below, and follow the slides here.

Projects

Willis sought to find out what makes a Kickstarter project work. He came to the hackathon with data from 2009-2017 and a trained model with 60% accuracy, up from 30% at the beginning of his work. Knowing whether a Kickstarter will succeed is a huge investment advantage, so watch the short videos to see how well he did.

Pitch:

Conclusion:

Elizabeth Briel and Ben Davis have been seeking new ways to tell the story of global warming’s effects on Arctic sea ice, and came to the hackathon with data they wanted to turn into a song. See the results below.

Slides are here.

Pitch:

Conclusion:

Nick Lam-wai created a thorough database of the Hong Kong budget, turning it from a human-readable collection of documents back into a form ready for machine analysis.

Slides here.

Pitch:

Conclusion:

Overwatch strategies revealed with data science

Ram de Guzman presented this analysis of Overwatch team strategies using scraped data from Winston’s Lab (which gathers it directly from game videos). His insight revealed how the best teams in South Korea arranged their teams and fought.

In the video he describes the process of gathering his data, then shows in impressive visualisations how that data relates to actual game strategy.

Watch his talk at our 6th unhackathon in March here:

 

And you can follow his project here.

November Unhackathon

Our 3rd event!

Once again, a small crowd of data scientists was courageous enough to resist the urge to chill out in Hong Kong’s wonderful Sunday weather, and instead came to hone their skills on 2 topics:

  • An exploration of HKEX data and its links to HK financial markets
  • A study of the very hyped cryptocurrencies

Crypto-currencies correlation

This topic is a follow-up to the earlier “Coindex” project.
Studying correlation should indicate how important diversification would be in a portfolio or index of crypto-currencies, in other words, how well an index would reflect the true performance of the currencies in the crypto world.

Here the focus was a classical study of correlation among the currencies available on the Poloniex exchange on Sep 16th, 2017.
First, a joyplot (ridge plot) showed the shapes of the return distributions for many currencies:
ridge_plot.jpeg

Some currencies, such as OMG (OmiseGo) and CVC (Civic), are too new: their short histories make their returns far from normally distributed, so they were treated as outliers and removed from the scope.

Then we computed the pairwise correlations:

heatmap.png

This gives a global average correlation of 36% (the average of all pairwise correlations), hinting that diversification could be an important driver of portfolio efficiency.
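As a rough sketch of how such a global average could be computed, the snippet below averages the Pearson correlation over every distinct pair of currencies. It assumes the average excludes each currency's correlation with itself; the function names and the tiny made-up return series are purely illustrative and are not taken from the team's analysis.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length series (neither may be constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def average_pairwise_correlation(returns):
    """returns: dict mapping a currency name to its list of returns.

    Averages the correlation over every distinct pair of currencies.
    """
    pairs = list(combinations(returns, 2))
    return sum(pearson(returns[a], returns[b]) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical daily returns for three made-up currencies.
    sample = {
        "AAA": [0.01, -0.02, 0.03, 0.00, 0.01],
        "BBB": [0.02, -0.01, 0.02, -0.01, 0.00],
        "CCC": [-0.01, 0.01, 0.00, 0.02, -0.02],
    }
    print(average_pairwise_correlation(sample))
```

The lower this average, the more the currencies move independently of each other, and the more a diversified portfolio or index can smooth out individual swings.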

Plotting this measure over time shows that the correlation tends to increase, suggesting that crypto markets are re-correlating.

histocorrel.png
The next step might be to understand why this re-correlation happens.
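One way to track the average correlation over time is with a rolling window: recompute the average pairwise correlation on the most recent observations at each step. The sketch below assumes that approach; the window length, function names and sample numbers are hypothetical, and the team's actual method may well differ.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length series (neither may be constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def rolling_average_correlation(returns, window):
    """Average pairwise correlation for each window of consecutive observations.

    returns: dict mapping a currency name to its list of returns (equal lengths).
    The result has one value per window, oldest first.
    """
    names = list(returns)
    length = len(returns[names[0]])
    pairs = list(combinations(names, 2))
    series = []
    for end in range(window, length + 1):
        start = end - window
        total = sum(pearson(returns[a][start:end], returns[b][start:end])
                    for a, b in pairs)
        series.append(total / len(pairs))
    return series


if __name__ == "__main__":
    # Made-up returns: the two series disagree early on and move together later.
    sample = {
        "AAA": [1, 2, 1, 2, 1, 2, 3, 4],
        "BBB": [2, 1, 2, 1, 1, 2, 3, 4],
    }
    print(rolling_average_correlation(sample, window=4))
```

A rising series from a calculation like this is what "re-correlation" would look like in the numbers.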

The complete analysis, including the data used, can be found on GitHub.

 

Our first event: Unhackathon at the Hive

What is an Unhackathon anyway?

Data Science Hong Kong was set up as a way for people interested in data science to network and share ideas. We have an active public Slack group where people regularly share articles and discuss all things tech and data science. The group has organised a number of informal meetups before, but we wanted to start a regular event based around coding and presenting, and not just talking and networking.

There are many IT, tech and data science events in Hong Kong, but they are infrequent and often serve primarily as a marketing or recruitment tool. Not satisfied with the state of tech events in Hong Kong, we set out to create an event that was built from the bottom up and focused on who knew the most rather than who spoke the loudest: one that is inviting to beginners, but not to those uninterested in technical details.

We have therefore started a regular unhackathon. This is our term for a hackathon where the agenda is set by participants and people have fun coding together rather than competing. It’s a way to improve your skills and share projects you are passionate about with the community.

Our first event gets under way

Our first gathering was made possible by The Hive. They were very keen to support the data science community in Hong Kong and let us use the MakerHive in Kennedy Town, which was a fantastic venue for our first event.

The event started with the floor being opened to pitches. After signing up for a slot by putting up a post-it, pitchers were given 5 minutes to convince others to work on their project.

There were many great ideas and teams were formed around those that attracted enough interest. Discussions were soon under way on what each team wanted to achieve by the end of the day.

 

Of course, being a hackathon, there was coding, coding and more coding!

 

When lunchtime came, teams headed out to Kennedy Town centre to find a restaurant. Any loss of coding output was more than made up for by the opportunity to get to know teammates better. Real data scientists don’t skip lunch!

Presentation time

Four hours and much coding later, the deadline for presentations loomed. All the teams gladly accepted a 20-minute grace period to put the final touches on their work.

 

Some of the projects presented were:

  • Address mapping in Hong Kong
  • Twitter topic analysis
  • Crypto-currency analysis
    This team aimed to build an index of cryptocurrencies similar to conventional financial market indices, to be used as a benchmark or refined to explore portfolio strategies.

 

  • Facial Expression Recognition using Keras
    The team of 3 took a convolutional neural network model built for MNIST and retrained it on facial expression data from Kaggle, reaching 55% accuracy over 7 categories.

 
Everyone made great progress on their projects, and a common theme across presentations was that so much more could have been accomplished with just a bit more time. It’s good, then, that we have already started planning our next event in September!

Just because the event is over does not mean the coding stops! If you enjoyed the project you worked on or, more importantly, enjoyed the people you worked with, then do continue collaborating and share what you did with us at our next event!

If this event sounds interesting, please contact us by email or social media, or join our Slack group. We’ll keep you updated there about any future events.

Data Science Hong Kong