Once again, a small crowd of data scientists was courageous enough to resist the impulse to just chill out in Hong Kong’s wonderful Sunday weather, and instead came to hone their skills on two topics:
An exploration of HKEX data and its links to HK financial markets
A study of the much-hyped cryptocurrencies
This topic is a follow-up to the previous “Coindex” subject.
Studying correlation should give an idea of how much diversification matters in a portfolio or index of crypto-currencies; in other words, how well an index would reflect the true performance of the currencies in the crypto world.
Here the focus was a classical study of correlation among the currencies available on the Poloniex exchange on 16th September 2017.
First of all, a joyplot showed the shapes of the return distributions for many currencies:
Some currencies, such as OMG (OmiseGo) and CVC (Civic), are too new and have a short history that makes their returns far from normally distributed; they were therefore treated as outliers and removed from the scope.
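One simple way to screen out currencies with too little history is sketched below in plain Python. The function name, the `min_observations` threshold and the sample figures are assumptions made for illustration; the analysis itself may have identified the outliers differently, for example by inspecting the distributions on the joyplot.

```python
def drop_short_histories(returns, min_observations):
    """Keep only the currencies with at least `min_observations` returns."""
    return {
        name: series
        for name, series in returns.items()
        if len(series) >= min_observations
    }


# Made-up daily returns (hypothetical values): OMG has only two observations.
returns = {
    "BTC": [0.01, -0.02, 0.03, 0.00, 0.01],
    "ETH": [0.02, -0.01, 0.04, -0.03, 0.00],
    "OMG": [0.25, -0.30],
}

kept = drop_short_histories(returns, min_observations=5)
print(sorted(kept))
```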
Then we carried out the correlation calculations proper.
This gives a global average correlation of 36% (the average of all pairwise correlations), hinting that diversification could be an important driver of portfolio efficiency.
If we plot this measure over time, we see that the correlation tends to increase, suggesting some re-correlation of crypto markets.
The next step might be to understand why this re-correlation happens.
The complete analysis, including the data used, can be found on GitHub.
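To make the measure concrete, here is a minimal sketch of a global average correlation and its evolution over a trailing window. It is written in plain Python with made-up returns; the function names, the window length and the data are assumptions for illustration and are not taken from the analysis itself.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long return series.

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_pairwise_correlation(returns):
    """Average the correlation over every distinct pair of currencies."""
    pairs = list(combinations(returns, 2))
    return sum(pearson(returns[a], returns[b]) for a, b in pairs) / len(pairs)


def rolling_average_correlation(returns, window):
    """Recompute the average pairwise correlation on each trailing window."""
    length = len(next(iter(returns.values())))
    return [
        average_pairwise_correlation(
            {name: series[end - window:end] for name, series in returns.items()}
        )
        for end in range(window, length + 1)
    ]


# Hypothetical currencies: BBB moves exactly with AAA, CCC exactly against it.
returns = {
    "AAA": [0.01, -0.02, 0.03, 0.00, 0.02, -0.01],
    "BBB": [0.02, -0.04, 0.06, 0.00, 0.04, -0.02],
    "CCC": [-0.01, 0.02, -0.03, 0.00, -0.02, 0.01],
}

global_average = average_pairwise_correlation(returns)
over_time = rolling_average_correlation(returns, window=4)
print(global_average, over_time)
```

With real data, each value in `over_time` would be one point on the correlation-over-time chart, so an upward drift in that list is what re-correlation looks like numerically.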
Following the success of our first event, we again met up at the MakerHive in Kennedy Town for our un-hackathon. This is our term for a hackathon where the agenda would be set by participants and people would have fun coding together, instead of being a competition. It’s a way to improve your skills and share projects you are passionate about with the community.
Some projects from our previous event were pitched again while a number of new projects were also started. After teams were formed, the coding quickly got under way. Attendees gathered for the presentation as the teams showed off their results.
An initiative to scrape public data with Python and R; Scrapy was used to pull HKEX data.
Visualisation of the blockchain
On 12th May, computers worldwide were hit by the WannaCry ransomware attack. The attackers demanded that ransom payments be made to a number of bitcoin wallets. Blockchain data about these wallets from the period of the attack was sourced and visualised using D3.
Horse racing prediction
“Anomalies” in the betting market for horse racing mean that the outcome of a horse race could be predicted. RapidMiner and Python were used to scrape the data and create a predictive model.
The team were well organised and even produced a presentation of their results!
This team scraped data on traffic incidents using Scrapy (Python) and then visualised it using R.
Crypto-currencies investment strategies
This project is a follow-up to the previous unhackathon, which left us puzzled by some unexplained moves in certain currencies.
This time we had a better grasp of the problem and analysed the correlations and properties of simple indices made of a basket of currencies.
The global correlation among the first 20 currencies has amounted to 36% since 2017; this is low enough to hope for some diversification effect to take place.
An index in which each currency has the same weight does indeed outperform if we take BTCUSD as the benchmark.
Moreover, scaling the index down so that its volatility, or risk, matches that of Bitcoin vs USD then produces a significant gain of 15% over BTC.
On top of this, the skew, while negative for Bitcoin, becomes positive for the index: the frequent small losses incurred by the index are compensated by less frequent but much bigger gains!
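The three steps above (equal weighting, scaling to Bitcoin's volatility, and comparing skew) can be sketched in plain Python as follows. The function names and the sample returns are hypothetical, and averaging the per-period returns assumes the index is rebalanced to equal weights every period, which may differ from how the team built theirs.

```python
from statistics import mean, pstdev


def equal_weight_returns(returns):
    """Per-period index return: the plain average across currencies.

    Assumes the index is rebalanced to equal weights every period.
    """
    return [mean(period) for period in zip(*returns.values())]


def scale_to_volatility(index_returns, benchmark_returns):
    """Rescale index returns so their volatility equals the benchmark's."""
    factor = pstdev(benchmark_returns) / pstdev(index_returns)
    return [r * factor for r in index_returns]


def skewness(xs):
    """Population skewness: the third standardised moment."""
    m, s = mean(xs), pstdev(xs)
    return mean([((x - m) / s) ** 3 for x in xs])


# Made-up returns for Bitcoin and two hypothetical altcoins.
btc = [0.02, -0.05, 0.01, 0.03]
basket = {
    "BTC": btc,
    "ALT1": [0.10, -0.02, -0.01, 0.00],
    "ALT2": [0.04, -0.01, 0.00, -0.02],
}

index = equal_weight_returns(basket)
scaled = scale_to_volatility(index, btc)
print(pstdev(btc), pstdev(scaled))
print(skewness(btc), skewness(scaled))
```

Scaling by a positive constant changes the volatility but leaves the skew unchanged, so the sign of the skew is a property of the basket itself rather than of the risk level chosen.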
This encourages us to build other indices and strategies, and the project could lead to promising applications:
Trading strategies, either short or medium term, dynamic or static, including machine learning algorithms for the discovery of alpha in this market
The development of algorithmic trading tools following these strategies
Online analytics on single currencies or portfolios of them
Potentially, some advisory services for portfolio construction
Data Science Hong Kong was set up as a way for people interested in data science to network and share ideas. We have an active public Slack group where people regularly share articles and discuss all things tech and data science. The group has organised a number of informal meetups before, but we wanted to start a regular event based around coding and presenting, and not just on talking and networking.
There are many IT, tech and data science events in Hong Kong, but they are infrequent and often serve primarily as a marketing or recruitment tool. Not satisfied with the state of tech events in Hong Kong, we set out to create an event that was built from the bottom up and would focus on who knew the most and not who spoke the loudest: inviting to beginners, but not to those uninterested in technical details.
We have therefore started a regular unhackathon. This is our term for a hackathon where the agenda would be set by participants and people would have fun coding together, instead of being a competition. It’s a way to improve your skills and share projects you are passionate about with the community.
Our first event gets under way
Our first gathering was made possible by The Hive. They were very keen on supporting the data science community in Hong Kong and let us use the MakerHive in Kennedy Town, which was a fantastic venue for our first event.
The event started with the floor being opened to pitches. After signing up for a slot by putting up a post-it, pitchers were given 5 minutes to convince others to work on their project.
There were many great ideas and teams were formed around those that attracted enough interest. Discussions were soon under way on what each team wanted to achieve by the end of the day.
Of course, being a hackathon, there was coding, coding and more coding!
As it became time for lunch, teams headed out to Kennedy Town center to find a restaurant. Any loss of coding output was more than made up for by the opportunity that people got to better know their teammates. Real data scientists don’t skip lunch!
Four hours and much coding later, the deadline for presentations loomed. All the teams gladly accepted a 20-minute grace period to put the final touches on their work.
Some of the projects presented were:
Address mapping in Hong Kong
Twitter topic analysis
This team aimed to build an index of cryptocurrencies similar to the usual financial market indices, to be used as a benchmark or refined to explore portfolio strategies.
Facial Expression Recognition using Keras
The team of 3 used an MNIST convolutional neural network model and retrained it on facial expression data from Kaggle, reaching 55% accuracy over 7 categories.
Everyone had made great progress on their projects and a common theme across presentations was that so much more could have been accomplished with just a bit more time. It’s good then that we have already started planning for our next event in September!
Just because the event is over does not mean the coding stops! If you enjoyed the project you worked on or more importantly enjoyed the people you worked with then do continue collaborating and share with us what you did at our next event!
If this event seems interesting then please contact us by email or social media, or join our Slack group. We’ll keep you updated there about any future events.