Akhil Theerthala

Akhil-Theerthala

akhil-theerthala

AI & ML interests

None yet

Recent Activity

replied to burtenshaw's post 2 days ago

Here’s a notebook to make Gemma reason with GRPO & TRL. I made this whilst prepping the next unit of the reasoning course.
In this notebook I combine Google’s model with some community tooling:
- First, I load the model from the Hugging Face Hub with transformers’ latest release for Gemma 3
- I use PEFT and bitsandbytes to get it running on Colab
- Then, I take Will Brown’s processing and reward functions to make reasoning chains from GSM8k
- Finally, I use TRL’s GRPOTrainer to train the model
Next step is to bring Unsloth AI in, then ship it in the reasoning course. Link to the notebook below.
https://colab.research.google.com/drive/1Vkl69ytCS3bvOtV9_stRETMthlQXR4wX?usp=sharing
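The reward-function step in the post can be sketched roughly as below. This is a minimal illustration, not the notebook's actual code: the `<reasoning>`/`<answer>` tag format, the reward values, and the helper names are assumptions made for the example. It also assumes completions arrive as plain strings and that the dataset has an `answer` column which the trainer forwards to the reward functions as a keyword argument.

```python
import re

# Assumed output format: <reasoning>...</reasoning><answer>...</answer>.
# The post does not specify the tags; they are hypothetical here.
ANSWER_RE = re.compile(r"<answer>\s*(.*?)\s*</answer>", re.DOTALL)
FORMAT_RE = re.compile(
    r"^\s*<reasoning>.*?</reasoning>\s*<answer>.*?</answer>\s*$", re.DOTALL
)


def extract_gsm8k_target(solution: str) -> str:
    """GSM8K reference solutions end with '#### <number>'; return that number."""
    return solution.split("####")[-1].strip().replace(",", "")


def extract_model_answer(completion: str) -> str:
    """Return the text inside the <answer> tag, or '' if the tag is missing."""
    match = ANSWER_RE.search(completion)
    return match.group(1).strip().replace(",", "") if match else ""


def correctness_reward(prompts, completions, answer, **kwargs):
    """One score per completion: 2.0 if the extracted answer matches the target."""
    return [
        2.0 if extract_model_answer(c) == a else 0.0
        for c, a in zip(completions, answer)
    ]


def format_reward(prompts, completions, **kwargs):
    """One score per completion: 0.5 if it follows the assumed tag layout."""
    return [0.5 if FORMAT_RE.match(c) else 0.0 for c in completions]
```

Functions of this shape (taking `prompts` and `completions` and returning a list of floats) could then be passed to TRL's GRPOTrainer through its `reward_funcs` argument, with the GSM8K targets prepared in advance using `extract_gsm8k_target`.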

updated a dataset 2 days ago

Akhil-Theerthala/Personal-Finance-Queries

published a dataset 8 days ago

Akhil-Theerthala/Personal-Finance-Queries


Organizations

None yet

Akhil-Theerthala's activity

replied to burtenshaw's post 2 days ago

Thanks, I needed this.

updated a dataset 2 days ago

Akhil-Theerthala/Personal-Finance-Queries

Preview • Updated 2 days ago • 83

published a dataset 8 days ago

Akhil-Theerthala/Personal-Finance-Queries

Preview • Updated 2 days ago • 83

updated a dataset 25 days ago

Akhil-Theerthala/PersonalFinance-CoTR-5K

Viewer • Updated 25 days ago • 5.02k • 82 • 1

published a dataset 27 days ago

Akhil-Theerthala/PersonalFinance-CoTR-5K

Viewer • Updated 25 days ago • 5.02k • 82 • 1

upvoted an article about 2 months ago

Article

Timm ❤️ Transformers: Use any timm model with transformers

Jan 16

• 44

replied to merve's post about 2 months ago

A fascinating week indeed!

reacted to merve's post with 🔥 about 2 months ago

Post

5258

Oof, what a week! 🥵 So many things have happened, let's recap! merve/jan-24-releases-6793d610774073328eac67a9

Multimodal 💬
- We have released SmolVLM -- tiniest VLMs that come in 256M and 500M, with its retrieval models ColSmol for multimodal RAG 💗
- UI-TARS are new models by ByteDance to unlock agentic GUI control 🤯 in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released Minimax-VL-01, whose decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a new challenging MM benchmark

LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1 with MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, new family of models and their datasets (SFT and reward ones too!)

Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO

Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux
- Tencent released Hunyuan3D-2, new 3D asset generation from images

7 replies