Tanvin Sharma
Software Engineer - NLP and AI Enthusiast
Warsaw, Poland
SUMMARY
Master's student and software developer with past experience in backend technologies, currently specialising in Natural Language Processing and AI.
EDUCATION
Technical University of Munich (TUM)
2022 – 2024 | Master of Science
Major: Computer Science
Specialization: Artificial Intelligence / NLP
Subjects: Machine Learning, Intro to Deep Learning, Efficient Data Structures and Algorithms, Natural Language Processing, Advanced NLP, Blockchain Based Systems Engineering, Strategic IT Management, Intro to Quantum Computing
GPA: 2.1
Warsaw University of Technology
2018 – 2022 | Bachelor of Science
Major: Computer Science
GPA: 4.56
Delhi Public School, RK Puram, New Delhi, India
2018 | High School
Percentage: 95.0
EXPERIENCE
Software Engineer - Google
April 2025 – Present | Warsaw, Poland
- › Working on Virtual Machine Lifecycle Management in GCP
Working student: NLP for Requirements - Mercedes Benz
Nov 2023 – Sept 2024 | Munich, Germany
- › Using current NLP techniques to automate linguistic analysis of software requirements text
- › Developing a method for characterizing the complexity of requirement statements
Backend Ruby on Rails Dev - Toptal
May 2022 – Nov 2022 | Remote, USA
- › Responsible for making REST APIs for Toptal's employee system.
- › Mainly working with GraphQL and Ruby on Rails
Backend Python Developer - cthings.co
Mar 2021 – Nov 2021 | Warsaw, PL
- › Responsible for building REST APIs for the NID Smart Manhole Project and the MPWiK Pipelines Project. Worked extensively on the application's Reports functionality, which provides customers with data for analysis and study.
- › Mainly working with FastAPI and MongoDB
NLP Intern - pradhi.ai
Sept 2020 – Jan 2021 | Hyderabad, India
- › Worked on an NLP project: building a Q&A system that took large, technically complex documents as input and returned a valid answer to a given question.
- › Used Python data science tooling (especially Pandas), NLP libraries such as NLTK and spaCy, and Google Cloud TPUs to run sentence transformers such as BERT.
Python Developer - Universality
Jan 2020 – May 2020 | Warsaw, PL
- › Developed Python educational material for students and teachers
- › Wrote Python tasks for students to solve, along with an automatic checker that verifies whether a solution is correct, for the teacher's convenience
PROJECTS
Hearfront.ai
Personal Project
- › By leveraging NLP, Hearfront aims to give users a convenient way to leave customer feedback and to give feedback collectors insights that go beyond general statistics.
- › Users can simply speak their feedback instead of typing lengthy responses or navigating text forms.
- › Businesses collecting feedback get human-centric insights that are easy to understand and act on, using sentiment analysis, topic clustering, feedback summarization, impact analysis, etc.
Master's Thesis | Reducing Computational Cost of Multilinguality in LLMs
Summer Semester 2024 | TUM (Social Computing Chair), Germany
- › Experimented with morphological subword embeddings and optimized tokenization, achieving near state-of-the-art performance with 70% fewer trainable parameters.
- › This thesis introduces a custom architecture that combines a curated Hindi-English dictionary and a BERT model fine-tuned using Low-Rank Adaptation (LoRA).
- › Link to the thesis
NLP Lab Course: Green and Recyclable AI
Winter Semester 2023/24 | TUM (Social Computing Chair), Germany
- › This project consisted of identifying the most important weights over multiple batches of data and ad-hoc masking of weights to produce sparse matrices or recalculate new dense matrices.
- › Reduced training and inference costs using pruning techniques, resulting in 40% cost savings without degrading model performance.
- › Achieved performance comparable to baseline models with 30-50% fewer parameters
LISSA - Language Interface for Scientific Search Assistance: A University Project
Summer Semester 2023 | TUM (SEBIS Chair), Germany
- › Built a conversational agent using Rasa and Neo4j, improving search accuracy for scientific topics by 60%.
- › The goal was to handle different user queries, explain the topics related to those queries and provide scientific papers where needed. The bot was intended for people looking to get into scientific research in NLP.
- › Ran many topic-recognition experiments using the GPT API, TF-IDF scoring, similarity search and few-shot learning (SetFit)
Evaluating texts generated by LLMs - Interdisciplinary Project (IDP) at University (TUM)
Summer Semester 2023 | TUM, Germany
- › The goal of this IDP is to create a scoring function that assigns high scores to good feedback and low scores to poor feedback, as judged from a human perspective.
- › Based on a set of researched scoring parameters, the plan is to incorporate human feedback and use RLHF to train a language model that generates text maximising this score.
- › Skills used and learnt: GPT API, Transformers and Datasets libraries from Hugging Face, Llama 2, TRL for RLHF.
ALICE (CERN) Data Visualisation Thesis
Bachelor Thesis | WUT, Poland
- › The goal was to replicate the collision trajectories resulting from the experiments at ALICE, CERN. The project was implemented both in Python (with pythreejs) and in React + JS (three.js) to compare server-side and client-side rendering.
- › Skills used and learnt: Pythreejs, JavaScript, React, HTML/CSS
Other Projects
- › Army Act Project: The NLP project I worked on at pradhi.ai. Started with data cleaning and structuring using Pandas, then used BERT and semantic search to find answers for the QA system.
- › Gocery: Slot-booking system for grocery stores in order to facilitate social distancing
- › Penguin Group Project in C: A penguin game, originally known as "Hey! That's my fish", written in C. It has an automatic mode and an interactive mode.
- › Basketball Team Manager in C++: An elaborate program to make teams, assign coaches, make fan clubs and many other features that are required for organising a basketball league.
- › Several other university projects were completed as part of my coursework. The languages and tools range from Shell, Assembly and C to Python, Qt and JavaScript. Most of them can be found on my GitHub.
KEY SKILLS
- Python
- PyTorch/Scikit-Learn
- Hugging Face (Transformers, Datasets)
- Prompt Engineering
- GPT/BERT/LLMs
- Weights and Biases
- Rasa
- Pandas and NumPy
- Docker
- Git
- Zsh/Bash
- AWS
- MongoDB
- JavaScript/Three.js
- SQL
- Ruby on Rails
- FastAPI
- C++