Specializing in Hierarchical Reasoning Models, Transformers, and Scalable Web Applications. Transforming complex AI research into practical solutions.
I am a Computer Science graduate student at Felician University with a strong background in Artificial Intelligence and Full Stack Development. My expertise spans building reasoning models and transformer architectures from scratch, managing IT operations, and developing React Native applications.
With experience ranging from Google Summer of Code to leading development teams, I am passionate about pushing the boundaries of what's possible with code.
Directed IT helpdesk operations and a team of 4 technicians. Managed 20+ daily tickets, achieving 95% user satisfaction and improving team accuracy by 25%.
Provided technical support by resolving 2,000+ helpdesk tickets, maintaining a 95% satisfaction rating, and helping supervise campus lab equipment and student workers.
Managed a team of 4 developers in building a React Native application. Streamlined the development process, cutting the initial launch timeline by 30%.
Developed a deep learning model for bone cancer detection using Python and TensorFlow, improving accuracy by 15%. Engineered an optimized data preprocessing pipeline that reduced image analysis time by 30%.
Associated with Mainly.ai
Engineered a brain-inspired model with coupled recurrent modules for abstract planning. Achieved SOTA performance (40.3% on ARC-AGI) with memory-efficient one-step gradient approximation.
Associated with Mainly.ai
Implemented a full Transformer using NumPy/PyTorch. Integrated multi-headed self-attention and achieved a SOTA BLEU score of 40 on English-German translation.
Associated with Mainly.ai
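The core of the multi-headed self-attention this project implements can be sketched in NumPy roughly as below. The random weight matrices stand in for learned parameters, and the head count and dimensions are illustrative, not the project's actual configuration:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, n_heads, rng):
    # x: (seq_len, d_model). Random projections stand in for learned weights.
    seq_len, d_model = x.shape
    assert d_model % n_heads == 0
    d_head = d_model // n_heads
    Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                      for _ in range(4))
    # Project, then split the feature axis across heads: (heads, seq, d_head).
    q = (x @ Wq).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    k = (x @ Wk).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    v = (x @ Wv).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    # Scaled dot-product attention per head: (heads, seq, seq).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    attn = softmax(scores, axis=-1)
    # Concatenate heads back to (seq, d_model) and apply the output projection.
    out = (attn @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo
```

In a full Transformer this sits inside a residual block with layer normalization and a feed-forward sublayer; the sketch shows only the attention computation itself.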
Adapted the Transformer architecture for computer vision with patch embeddings and a class token. Achieved 85% top-1 accuracy on CIFAR-10/ImageNet classification.
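The patch-embedding and class-token step that adapts a Transformer to images can be sketched like this. The projection, class token, and positional embeddings are random stand-ins for learned parameters; patch size and model width are illustrative:

```python
import numpy as np

def patchify(img, patch):
    # img: (H, W, C) -> (num_patches, patch*patch*C), patches in row-major order.
    H, W, C = img.shape
    ph, pw = H // patch, W // patch
    return (img[:ph * patch, :pw * patch]
            .reshape(ph, patch, pw, patch, C)
            .transpose(0, 2, 1, 3, 4)
            .reshape(ph * pw, patch * patch * C))

def embed_with_class_token(img, patch, d_model, rng):
    # Linear patch projection plus a [CLS] token, as in ViT.
    p = patchify(img, patch)
    W = rng.standard_normal((p.shape[1], d_model)) / np.sqrt(p.shape[1])
    tokens = p @ W                               # (num_patches, d_model)
    cls = rng.standard_normal((1, d_model))      # class token, prepended
    pos = rng.standard_normal((tokens.shape[0] + 1, d_model)) * 0.02
    return np.concatenate([cls, tokens], axis=0) + pos
```

The resulting token sequence is fed to a standard Transformer encoder, and the class token's final state is used for classification.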
A collection of advanced deep learning architectures and algorithms built from scratch.
Implementation of Research Paper
Implementations of various Transformer architectures including Multi-headed attention, Transformer XL, GPT, MLP-Mixer, ViT, and Switch Transformer.
Implementation of Research Paper
Implementation of Recurrent Highway Networks with enhanced depth and sequential processing capabilities.
Implementation of Research Paper
Deep learning models utilizing Long Short-Term Memory networks for processing sequential data.
Implementation of Research Paper
Implementation of HyperLSTM - utilizing a smaller network to generate weights for a larger LSTM network.
Implementation of Research Paper
Implementation of Residual Networks to train extremely deep neural networks via shortcut connections.
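The shortcut connection at the heart of a residual block can be sketched in a few lines. This is a minimal two-layer version without batch normalization; the weights are placeholders for learned parameters:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    # y = relu(x + F(x)): the identity shortcut lets signal (and gradients)
    # bypass F entirely, which is what makes very deep stacks trainable.
    return relu(x + relu(x @ W1) @ W2)
```

A useful sanity check: if the residual branch F is zeroed out, the block reduces to the identity on non-negative inputs, so stacking blocks can never make the network strictly worse than a shallower one.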
Implementation of Research Paper
Implementation of ConvMixer, substituting convolutions for self-attention and MLP operations in vision tasks.
Implementation of Research Paper
Implementation of Capsule Networks to better model hierarchical relationships in image classification.
Implementation of Research Paper
Implementations of GAN architectures including Original GAN, Deep Convolutional GAN, Cycle GAN, Wasserstein GAN, and StyleGAN 2.
Implementation of Research Paper
Implementations of Generative Diffusion models including Denoising Diffusion Probabilistic Models (DDPM).
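The DDPM forward (noising) process has a closed form that can be sketched directly; the noise schedule below is a common linear choice and is illustrative, not necessarily the one used in this project:

```python
import numpy as np

def ddpm_forward(x0, t, betas, rng):
    # Closed-form q(x_t | x_0):
    #   x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps,  eps ~ N(0, I)
    # where abar_t is the cumulative product of (1 - beta_s) up to step t.
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps, eps
```

Training then amounts to teaching a network to predict `eps` from `x_t` and `t`; sampling runs the learned reverse process from pure noise.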
Implementation of Research Paper
Implementation of Sketch RNN for generating vector-based drawings using seq2seq VAEs.
Implementation of Research Paper
Implementations of Graph Attention Networks (GAT) and Graph Attention Networks v2 (GATv2).
Implementation of Research Paper
Solving games with incomplete information, such as Kuhn Poker, using Counterfactual Regret Minimization (CFR).
Implementation of Research Paper
Implementations of RL algorithms like PPO, Deep Q Networks, Prioritized Replay, and Dueling Networks.
Implementation of Research Paper
Implementation of deep learning optimizers including Adam, AMSGrad, Adam with warmup, Noam, Rectified Adam, and AdaBelief.
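As a flavor of what these optimizers look like, a single Adam update with bias correction can be sketched as follows (default hyperparameters shown; `t` is the 1-indexed step count):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its square.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    # Bias correction counteracts the zero initialization of m and v.
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Variants like AMSGrad, Rectified Adam, and AdaBelief modify how the second-moment term `v` is maintained or corrected, while keeping this overall update shape.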
Implementation of Research Paper
Implementations of Batch, Layer, Instance, Group, Batch-Channel Normalizations, and Weight Standardization.
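Layer normalization, the simplest of these to show in isolation, can be sketched as below: statistics are computed over the feature axis of each sample, so it behaves identically at train and test time and is independent of batch size.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize each row over its feature axis, then scale and shift.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta
```

Batch, instance, and group normalization follow the same normalize-scale-shift pattern but compute the statistics over different axes.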
Implementation of Research Paper
Implementation of Knowledge Distillation techniques to transfer knowledge to efficient models.
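The standard distillation objective blends a temperature-softened teacher-matching term with ordinary cross-entropy. A minimal sketch (the temperature `T` and mixing weight `alpha` are illustrative defaults, not this project's settings):

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft term: KL divergence from the softened teacher distribution to the
    # softened student distribution, scaled by T^2 to keep gradients comparable.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)),
                  axis=-1).mean() * T * T
    # Hard term: standard cross-entropy against the ground-truth labels.
    log_p = np.log(softmax(student_logits) + 1e-12)
    hard = -log_p[np.arange(len(labels)), labels].mean()
    return alpha * soft + (1 - alpha) * hard
```

The soft term is what transfers the teacher's "dark knowledge": its relative probabilities over wrong classes, which one-hot labels discard.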
Implementation of Research Paper
Implementation of Adaptive Computation models like PonderNet to dynamically adjust computation steps.
Implementation of Research Paper
Utilizing Evidential Deep Learning to quantify classification uncertainty in neural networks.
Relevant Coursework: Data Science, Data Mining, Artificial Intelligence
Relevant Coursework: Deep Learning, Linear Algebra, Calculus
My latest technical writings and thoughts published on Substack.
DeepLearning.AI
Issued Jun 2020
DeepLearning.AI
Issued Apr 2025
DeepLearning.AI
Issued Apr 2025
DeepLearning.AI
Issued Mar 2025
DeepLearning.AI
Issued Mar 2025
DeepLearning.AI
Issued Feb 2025
Stanford Online
Issued Jul 2022
Stanford Online
Issued Nov 5, 2024
Grade: 98.20%
Stanford Online
Issued Nov 4, 2024
Grade: 100%
Stanford Online
Issued Oct 31, 2024
Grade: 99.60%
Amazon Web Services (AWS)
Issued Jul 2023
Google Cloud (Coursera)
Issued Sep 2022
Google Cloud (Coursera)
Issued Jun 2022
CodeRed
Issued Feb 2023
CodeRed
Issued Feb 2023
University of Michigan
Issued Jun 2020
University of Michigan
Issued Sep 26, 2022
Grade: 99.17%
University of Michigan
Issued Sep 27, 2022
Grade: 100%
University of Michigan
Issued Oct 4, 2022
Grade: 95.80%
University of Michigan
Issued Nov 2, 2022
Grade: 96.88%
University of Michigan
Issued Oct 7, 2022
Grade: 92%
Coursera
Issued Dec 2022
HackerRank
Issued Jul 2021
I'm open to opportunities in AI, Machine Learning, and Full Stack Development.