Awesome

Awesome Deep Learning Project Ideas

A curated list of practical deep learning and machine learning project ideas

30+ ideas
Relevant to both the academia and industry
Ranges from beginner friendly to research projects

Hackathon Ideas - Project ideas unlocked by use of Large Language Models, specially text to text -- note that a lot of the text to text ideas can also be buit a lot better with LLMs now!
Text - With some topics about Natural language processing
Forecasting - Most of the topics in this section is about Time Series and similar forecasting challenges
Recommendation Systems
Vision - With topics about image and video processing
Music and Audio - These topics are about combining ideas from language and audio to understand music
Conclusion

Hackathon Ideas

Developer Ideas
- Text to cmd for terminal: Take user intent in terminal e.g.
```
$ask "how to list all files with details"
> Execute "ls -l"? [y/N] y 
$ls -l
```
- Build and edit YAMLs using natural language e.g. Kubernetes and other form of config files
  - Kor for ideas on how this is done for JSON
  - Can be use-case specific. Build pipelines? Kube?
- Mobile Android/iOS SDK for Stable Diffusion inference
  - Apple has released a CoreML Stable Diffusion Inference
Voice powered Experiences
- Audio Conversation with chatGPT, can combine with fast Text-to-Speech e.g. Eleven Labs to have a two-way conversation
- Telegram/WhatsApp bot to get audio and save as text with metadata into mem.ai or Roam Research or Obsidian
Edit image by giving instructions of what you want to do: SeeChatGPT and playgroundai.com as examples
- The underlying mechanism which you can use is called InstructPix2Pix
Semantic search over any media
- Can build using CLIP or BLIP-2 embeddings for images and CLAP for all audio including music and speech
Text to Music Generation
- See MusicLM for reference
Knowledge Base QA aka Answer Engines
- Take any plaintext dataset e.g. State of the Union address and build on top of that
- Can use this over Video Subtitles to search and QA over videos as well, by mapping back to source
Guided Summarisation/Rewriting
- Take specific questions which the user might have about a large text dataset e.g. a novel or book and include that in your summary of the piece
- Pay attention to specific entities and retell the events which happen in a story with attention to that character
ControlNet + Stable Diffusion for Aethetic Control
- Build tooling using diffusers which takes in a set of photos, finetunes a model (LoRA) on a person, detects face and moves it to a new aesthetic e.g. futuristic neon punk, grunge rock, Studio Ghibli. Can also add InstructPix2Pix to give user more control.
Text to Code/SQL
- Use code understanding to convert use query to SQL or another executable programming language, including Domain Specific Languages
- Here is an example of the same: qabot

Text

Autonomous Tagging of StackOverflow Questions
- Make a multi-label classification system that automatically assigns tags for questions posted on a forum such as StackOverflow or Quora.
- Dataset: StackLite or 10% sample
Keyword/Concept identification
- Identify keywords from millions of questions
- Dataset: StackOverflow question samples by Facebook
Topic identification
- Multi-label classification of printed media articles to topics
- Dataset: Greek Media monitoring multi-label classification

Natural Language Understanding

Sentence to Sentence semantic similarity
- Can you identify question pairs that have the same intent or meaning?
- Dataset: Quora question pairs with similar questions marked
Fight online abuse
- Can you confidently and accurately tell whether a particular comment is abusive?
- Dataset: Toxic comments on Kaggle
Open Domain question answering
- Can you build a bot which answers questions according to the student's age or her curriculum?
- Facebook's FAIR is built in a similar way for Wikipedia.
- Dataset: NCERT books for K-12/school students in India, NarrativeQA by Google DeepMind and SQuAD by Stanford
Automatic text summarization
- Can you create a summary with the major points of the original document?
- Abstractive (write your own summary) and Extractive (select pieces of text from original) are two popular approaches
- Dataset: CNN and DailyMail News Pieces by Google DeepMind
Copy-cat Bot
- Generate plausible new text which looks like some other text
- Obama Speeches? For instance, you can create a bot which writes some new speeches in Obama's style
- Trump Bot? Or a Twitter bot which mimics @realDonaldTrump
- Narendra Modi bot saying "doston"? Start by scrapping off his Hindi speeches from his personal website
- Example Dataset: English Transcript of Modi speeches

Check mlm/blog for some hints.

Sentiment Analysis
- Do Twitter Sentiment Analysis on tweets sorted by geography and timestamp.
- Dataset: Tweets sentiment tagged by humans

Forecasting

Univariate Time Series Forecasting
- How much will it rain this year?
- Dataset: 45 years of rainfall data
Multi-variate Time Series Forecasting
- How polluted will your town's air be? Pollution Level Forecasting
- Dataset: Air Quality dataset
Demand/load forecasting
- Find a short term forecast on electricity consumption of a single home
- Dataset: Electricity consumption of a household
Predict Blood Donation
- We're interested in predicting if a blood donor will donate within a given time window.
- More on the problem statement at Driven Data.
- Dataset: UCI ML Datasets Repo

Recommendation systems

Movie Recommender
- Can you predict the rating a user will give on a movie?
- Do this using the movies that user has rated in the past, as well as the ratings similar users have given similar movies.
- Dataset: Netflix Prize and MovieLens Datasets
Search + Recommendation System
- Predict which Xbox game a visitor will be most interested in based on their search query
- Dataset: BestBuy
Can you predict Influencers in the Social Network?
- How can you predict social influencers?
- Dataset: PeerIndex

Vision

Image classification
- Object recognition or image classification task is how Deep Learning shot up to it's present-day resurgence
- Datasets:
  - CIFAR-10
  - ImageNet
  - MS COCO is the modern replacement to the ImageNet challenge
  - MNIST Handwritten Digit Classification Challenge is the classic entry point
  - Character recognition (digits) is the good old Optical Character Recognition problem
  - Bird Species Identification from an Image using the Caltech-UCSD Birds dataset dataset
- Diagnosing and Segmenting Brain Tumors and Phenotypes using MRI Scans
  - Dataset: MICCAI Machine Learning Challenge aka MLC 2014
- Identify endangered right whales in aerial photographs
  - Dataset: MOAA Right Whale
- Can computer vision spot distracted drivers?
  - Dataset: State Farm Distracted Driver Detection on Kaggle
Bone X-Ray competition
- Can you identify if a hand is broken from a X-ray radiographs automatically with better than human performance?
- Stanford's Bone XRay Deep Learning Competition with MURA Dataset
Image Captioning
- Can you caption/explain the photo a way human would?
- Dataset: MS COCO
Image Segmentation/Object Detection
- Can you extract an object of interest from an image?
- Dataset: MS COCO, Carvana Image Masking Challenge on Kaggle
Large-Scale Video Understanding
- Can you produce the best video tag predictions?
- Dataset: YouTube 8M
Video Summarization
- Can you select the semantically relevant/important parts from the video?
- Example: Fast-Forward Video Based on Semantic Extraction
- Dataset: Unaware of any standard dataset or agreed upon metrics? I think YouTube 8M might be good starting point.
Style Transfer
- Can you recompose images in the style of other images?
- Dataset: fzliu on GitHub shared target and source images with results
Chest XRay
- Can you detect if someone is sick from their chest XRay? Or guess their radiology report?
- Dataset: MIMIC-CXR at Physionet
Clinical Diagnostics: Image Identification, classification & segmentation
- Can you help build an open source software for lung cancer detection to help radiologists?
- Link: Concept to clinic challenge on DrivenData
Satellite Imagery Processing for Socioeconomic Analysis
- Can you estimate the standard of living or energy consumption of a place from night time satellite imagery?
- Reference for Project details: Stanford Poverty Estimation Project
Satellite Imagery Processing for Automated Tagging
- Can you automatically tag satellite images with human features such as buildings, roads, waterways and so on?
- Help free the manual effort in tagging satellite imagery: Kaggle Dataset by DSTL, UK

Music

Music/Audio Recommendation Systems
- Can you tell if two songs are similar using their sound or lyrics?
- Dataset: Million Songs Dataset and it's 1% sample.
- Example: Anusha et al
Music Genre recognition using neural networks
- Can you identify the musical genre using their spectrograms or other sound information?
- Datasets: FMA or GTZAN on Keras
- Get started with Librosa for feature extraction

FAQ

Can I use the ideas here for my thesis? Yes, totally! I'd love to know how it went.
Do you have any advice before I start my project? Advice for Short Term Machine Learning Projects by Tim R. is a pretty good starting point!
How can I add my ideas here? Just send a pull request and we'll discuss?
Hey, something is wrong here! Yikes, I am sorry. Please tell me by raising a GitHub issue.

I'll fix it as soon as possible.

Acknowledgements

Problems are motivated by the ones shared at:

Credit

Built with lots of keyboard smashing and copy-pasta love by NirantK. Find me on Twitter!

License

This repository is licensed under the MIT License. Please see the LICENSE file for more details.