Crypto Price Predictor using Twitter Sentiment Analysis

Ruby Zhang

Advisor: Luisa Pereira

A full-stack data platform for crypto investors that produces real-time, indicative predictions of crypto price movement from historical prices and Twitter sentiment analysis, using a Random Forest model.

Project Website Presentation
Dashboard data visualization for crypto market cap

Project Description

This project features an AI-powered full-stack data platform designed for people who are interested in data. The platform produces real-time, indicative predictions of crypto price movement based on historical market prices and Twitter sentiment analysis data. The project has three main focuses: designing and engineering the frontend and backend architectures and the automated real-time data pipelines; drawing insights from crypto and text data available from public APIs, including a real-time crypto price API and the Twitter API; and building and training machine learning models with Random Forest and evaluating model performance.
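As a rough illustration of how price data and tweet sentiment might be combined into training rows for a Random Forest classifier, the sketch below joins hourly close prices with the mean sentiment of tweets posted in the same hour and labels each row by whether the next hour's close is higher. The function names, the hourly granularity, the field names, and the sample values are all assumptions made for illustration; the project's actual feature set is not described here.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hour_bucket(ts: str) -> str:
    """Truncate an ISO-8601 timestamp to the start of its hour."""
    dt = datetime.fromisoformat(ts)
    return dt.replace(minute=0, second=0, microsecond=0).isoformat()


def build_rows(prices, tweets):
    """Join hourly closes with mean tweet sentiment and label next-hour direction.

    prices: list of (iso_timestamp, close_price), one per hour, sorted by time.
    tweets: list of (iso_timestamp, sentiment_score), scores assumed in [-1, 1].
    """
    by_hour = defaultdict(list)
    for ts, score in tweets:
        by_hour[hour_bucket(ts)].append(score)

    rows = []
    # The last price has no following hour, so it cannot be labeled.
    for (ts, close), (_, next_close) in zip(prices, prices[1:]):
        scores = by_hour.get(hour_bucket(ts), [])
        rows.append({
            "hour": hour_bucket(ts),
            "close": close,
            # Hours with no tweets get a neutral score (an assumption).
            "mean_sentiment": mean(scores) if scores else 0.0,
            "tweet_count": len(scores),
            "label_up": int(next_close > close),
        })
    return rows


# Made-up sample data, for illustration only.
sample_prices = [
    ("2024-01-01T00:00:00", 100.0),
    ("2024-01-01T01:00:00", 105.0),
    ("2024-01-01T02:00:00", 101.0),
]
sample_tweets = [
    ("2024-01-01T00:15:00", 0.5),
    ("2024-01-01T00:45:00", 0.1),
    ("2024-01-01T01:30:00", -0.4),
]
rows = build_rows(sample_prices, sample_tweets)
```

Each resulting row holds numeric features (`close`, `mean_sentiment`, `tweet_count`) and a binary target (`label_up`), which is the general shape a tree-based classifier expects.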

Technical Details

The server uses FastAPI, a high-performance Python web framework built around asynchronous request handling. The user interface uses Next.js, which provides server-side rendering (SSR) for a better user experience. UI styling is implemented with Tailwind CSS. Metadata is stored in PostgreSQL, which excels at handling structured data, as metadata usually is. The time-series data is stored in MongoDB, whose flexible document model allows for easy adaptation to evolving time-series data structures.
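To show why asynchronous handling helps a server like this one, here is a minimal sketch using only Python's standard `asyncio` module. The two `fetch_*` coroutines are hypothetical stand-ins for a price lookup and a sentiment lookup, and the values they return are placeholders; neither reflects the project's real endpoints or data.

```python
import asyncio


async def fetch_price(symbol: str) -> dict:
    # Hypothetical stand-in for a non-blocking call to a price API.
    await asyncio.sleep(0.05)
    return {"symbol": symbol, "price": 42000.0}  # placeholder value


async def fetch_sentiment(symbol: str) -> dict:
    # Hypothetical stand-in for a non-blocking read of stored sentiment.
    await asyncio.sleep(0.05)
    return {"symbol": symbol, "mean_sentiment": 0.12}  # placeholder value


async def build_snapshot(symbol: str) -> dict:
    # gather() runs both coroutines concurrently, so the total wait is
    # roughly one delay rather than the sum of the two.
    price, sentiment = await asyncio.gather(
        fetch_price(symbol),
        fetch_sentiment(symbol),
    )
    return {**price, **sentiment}


if __name__ == "__main__":
    print(asyncio.run(build_snapshot("BTC")))
```

In an async web framework, a request handler written this way can wait on several I/O sources at once without blocking other requests.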

Research/Context

1. Kaito.ai is an advanced Web3 information platform that leverages artificial intelligence to provide comprehensive insights and analytics for the cryptocurrency and blockchain space. Kaito.ai features a market sentiment tracking methodology that covers general market sentiment, sector-specific sentiment, and token-specific sentiment. It also highlights data visualization design for sentiment analysis dashboards that present big data insights in a user-friendly way; a TL;DR feature that summarizes information on narratives and tickers (tokens) using its metasearch engine; and summarization of podcasts and Twitter Spaces that extracts the key points from each episode.

2. KryptoOracle is designed to provide real-time cryptocurrency price predictions by analyzing Twitter sentiment. It features a real-time data pipeline for price prediction, including real-time data collection, data preprocessing that uses VADER for sentiment analysis, and feature engineering that generates a sentiment FinalScore. It also features a real-time prediction methodology in which the model is retrained based on the loss between real-time predictions and actual data.

3. The paper "Predictive analysis of Bitcoin price considering social sentiments" focuses on applying Reddit sentiment analysis to Bitcoin price prediction. The author conducted 17 model training experiments on different deep learning model structures. The paper covers data preprocessing with pandas and NumPy, model evaluation using RMSE (Root Mean Square Error), and sentiment analysis using Flair and TextBlob. In the paper, the LSTM model reported as having the highest prediction accuracy uses the following data attributes: open price, day high price, day low price, close price, day volume, news polarity score, news subjectivity score, news negativity score, news positivity score, Reddit polarity score, Reddit subjectivity score, Reddit positivity score, Reddit negativity score.
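Two ideas from the references above lend themselves to a short sketch: RMSE as an evaluation metric, and retraining triggered by the gap between predicted and actual values. The code below is an illustration under stated assumptions, not the method used by KryptoOracle or by the paper. The `RetrainMonitor` class, its rolling window, and its threshold are hypothetical, and the sample numbers are made up.

```python
import math
from collections import deque


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


class RetrainMonitor:
    """Flag retraining when rolling RMSE exceeds a threshold (hypothetical rule)."""

    def __init__(self, window: int = 3, threshold: float = 50.0):
        # Deques with maxlen keep only the most recent `window` observations.
        self.actual = deque(maxlen=window)
        self.predicted = deque(maxlen=window)
        self.threshold = threshold

    def record(self, predicted: float, actual: float) -> bool:
        """Store one prediction/actual pair; return True if retraining is suggested."""
        self.predicted.append(predicted)
        self.actual.append(actual)
        if len(self.actual) < self.actual.maxlen:
            return False  # not enough observations yet
        return rmse(list(self.actual), list(self.predicted)) > self.threshold


# Made-up (predicted, actual) pairs, for illustration only.
monitor = RetrainMonitor(window=3, threshold=50.0)
flags = [
    monitor.record(predicted, actual)
    for predicted, actual in [(100, 110), (100, 120), (100, 130), (100, 300)]
]
```

Because RMSE squares each error before averaging, a single large miss (the last pair here) dominates the rolling window and pushes it over the threshold, which is the behavior one would usually want from a drift alarm.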