I am currently working as a Junior Data Scientist, improving my skills to become an AI researcher and engineer. I am a philomath who loves sports (especially cricket, Formula 1, and football), trying new food, and learning about new things.
You can reach me at chinmay.ch33@gmail.com
Some things about me:
- Currently living in Noida, India. Grew up in Balotra, a small town near Jodhpur in Rajasthan, India.
- Earned my Bachelor of Science in Mechanical Engineering at UIC.
Currently working at Oriserve, focusing on in-house speech-to-text and text-to-speech system development.
Currently working on:
- I am currently learning the Rust programming language to advance my skills in software development.
- To learn Rust, I am building a client-side API server that sends requests to an NVIDIA Triton server running the OpenAI Whisper model.
- The first step is to create endpoints that transcribe static audio files and return the transcription.
- Once static file transcription works, I will extend it so that a file can be transcribed in batches and the transcription streamed back to the user.
- Finally, I want to create endpoints that transcribe live audio streams, either online streams or microphone input, and return the transcription.
Research Interests:
- Small language models
- I am interested in research on decomposing large language models into smaller, specialized components.
- In particular, I want to assess the feasibility of breaking comprehensive language models into task-specific models that need less compute while maintaining high performance in their specialized domains.
- Developing methodologies for training and deploying lightweight, offline-capable AI systems that work without constant internet connectivity, with a focus on applications such as personal home assistants.
- Investigating resource optimization techniques that let AI systems run efficiently on edge devices, reducing dependence on cloud infrastructure while maintaining functionality and performance.
- End-to-end conversational AI agents
- I am interested in exploring AI agents that hold natural language conversations with users and give personalized, contextually relevant responses.
- These agents could be used in applications such as customer service, healthcare, education, and entertainment, where they can provide tailored and interactive experiences.
- Researching agents that understand complex queries, context, and intent, so that they can respond with accurate and relevant information.
- These agents can also bridge gaps in human communication through real-time translation, summarization, and question answering.
Finished Projects:
Stock Price Recommender
- Conducted a time-series analysis, using the Augmented Dickey-Fuller (ADF) test to check stationarity and identifying trends and seasonality in the stock price data.
- Developed a Recurrent Neural Network (RNN) model using Long Short-Term Memory (LSTM) layers for feature extraction, enabling the model to capture the temporal dependencies in the data.
- Attained a Mean Percentage Error of 2.78% in the univariate model and 0.8% in the multivariate model, maintaining consistent performance across models.
End to End Data Collection and Processing Pipeline
- Created a high-performance, streaming-enabled pipeline using Kafka and Prefect for efficient data collection and processing.
- Wrote Python scripts to process the collected data and store it in a database in a structured form.
- Used the Faker library to generate realistic fake data for testing.
Byte Pair Encoding and Tokenization
- Wrote a Go program to implement Byte Pair Encoding (BPE) and Tokenization (T5).
- Worked on this project while learning Go, gaining hands-on experience in developing and maintaining software projects.
- Implemented training of the encoder from text and byte-pair data so that it can encode and tokenize text.
- Implemented a tokenizer using the BPE encoder, enabling the tokenizer to efficiently tokenize text data.
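
For readers unfamiliar with BPE, the train-then-tokenize idea can be sketched in a few lines. The project itself is written in Go; the Python below is only an illustrative sketch, not the project's code. The function names (`train_bpe`, `encode`, `merge_pair`, `most_frequent_pair`) are hypothetical, and it assumes a byte-level vocabulary where new token ids start at 256.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Return the most common adjacent pair, or None if there are no pairs."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(tokens, pair, new_id):
    """Replace every non-overlapping occurrence of pair with new_id."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to num_merges merges over the UTF-8 bytes of text."""
    tokens = list(text.encode("utf-8"))
    merges = {}  # (left, right) -> new token id, kept in learned order
    for step in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        new_id = 256 + step  # assumed: ids 0-255 are reserved for raw bytes
        merges[pair] = new_id
        tokens = merge_pair(tokens, pair, new_id)
    return merges


def encode(text, merges):
    """Tokenize text by applying the learned merges in training order."""
    tokens = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        tokens = merge_pair(tokens, pair, new_id)
    return tokens


merges = train_bpe("aaabdaaabac", 3)
print(encode("aaabdaaabac", merges))  # [258, 100, 258, 97, 99]
```

Keeping the merges in the order they were learned matters: encoding replays them in that same order, so earlier (more frequent) merges are applied before the merges that build on them.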
Healthapp Log Analysis
- Analysed healthapp log data to identify patterns and trends in usage.
- Created visualizations and dashboards to present the data in a user-friendly and informative manner.
- Performed time-series analysis to identify trends and seasonality in user usage and engagement.
IPL Elo Ratings
- Implemented a dynamic system to update cricket team ratings post-match, providing an accurate and current reflection of team strengths.
- Analyzed all IPL teams, delivering in-depth insights into their Elo ratings, match records, wins, ties, and win ratios, contributing to a deeper understanding of team performances.
- Developed a visual representation of Elo rating progressions, allowing users to track and analyze team performance over time.
- Added per-team, per-season views of the rating progressions to make the data easier to interpret.
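
The post-match update at the core of a system like this can be sketched with the standard Elo formula. This is a minimal illustration, not the project's actual code: the function names are hypothetical, and the K-factor of 32 and the 400-point scale are assumed defaults rather than the values used in the project.

```python
def expected_score(rating_a, rating_b):
    """Expected score of team A against team B on the usual 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return the updated (rating_a, rating_b) after one match.

    score_a is 1.0 for an A win, 0.5 for a tie, and 0.0 for an A loss.
    k is an assumed K-factor controlling how fast ratings move.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + delta, rating_b - delta


new_a, new_b = update_elo(1500.0, 1500.0, 1.0)
print(new_a, new_b)  # 1516.0 1484.0
```

Because the update is zero-sum, an upset win against a higher-rated team moves both ratings more than an expected win does, which is what lets the ratings track current team strength.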
Awards:
Best in Category award for Aerospace and Automotive Engineering at UIC Engineering Expo 2022
- Award for senior design projects in the automotive and aerospace engineering department at UIC. The project was the design of a new muffler for the Formula SAE team at UIC.
- Designed the muffler with a focus on sound attenuation and noise reduction, which resulted in a significant reduction in noise levels and improved performance.
- Performed all the fabrication and testing of the muffler, ensuring its quality and reliability.