Pulkit Mehta (pulkitmehtawork)
AI & ML interests: None yet
Recent Activity
Upvoted an article (about 1 hour ago): Illustrating Reinforcement Learning from Human Feedback (RLHF)
Replied to natolambert's post (about 1 hour ago):
Today, we’re releasing our first pretrained Open Language Models (OLMo) at the Allen Institute for AI (AI2), a set of 7 billion parameter models and one 1 billion parameter variant. This line of work was probably the main reason I joined AI2 and is the biggest lever I see possible to enact meaningful change in how AI is used, studied, and discussed in the short term.
Links at the top because that's what you want:
* Core 7B model: https://huggingface.co./allenai/OLMo-7B
* 7B model twin (different GPU hardware): https://huggingface.co./allenai/OLMo-7B-Twin-2T
* 1B model: https://huggingface.co./allenai/OLMo-1B
* Dataset: https://huggingface.co./datasets/allenai/dolma
* Paper (arXiv soon): https://allenai.org/olmo/olmo-paper.pdf
* My personal blog post: https://www.interconnects.ai/p/olmo
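For a quick start, here is a minimal sketch of loading the 7B base model with Hugging Face transformers. It assumes the release-time setup, where OLMo's custom modeling code comes from the ai2-olmo package (pip install ai2-olmo) and requires trust_remote_code=True; later transformers versions may integrate the architecture natively.

# Minimal sketch: sample text from the OLMo-7B base model.
# Assumes: pip install transformers ai2-olmo (release-time setup).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Simple sampling; this is a base model, not instruction-tuned.
inputs = tokenizer("Language modeling is ", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))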
OLMo represents a new type of LLM, enabling new approaches to ML research and deployment, because on a key axis, openness, it is something entirely different. OLMo is built so that scientists can develop research directions at every point in the development process and execute on them, which was previously impossible because the necessary information and tools were incomplete.
Depending on the evaluation method, OLMo 1 is either the best 7-billion-parameter base model available for download or one of the best. This relies on a new way of thinking in which models are judged on their combined parameter and token budget, similar to how scaling laws are measured for LLMs.
We're just getting started, so please help us learn how to be more scientific with LLMs!
Reacted to natolambert's post with ❤️ (about 1 hour ago); the post text is the same as above.
Organizations
Collections: 1 • Spaces: 3
Models (5)

* pulkitmehtawork/distil-bert-uncased-pulkit-sts • Sentence Similarity • Updated • 17
* pulkitmehtawork/ModernBERT-base-dutch-full • Updated
* pulkitmehtawork/SmolLM2-FT-python • Text Generation • Updated • 26
* pulkitmehtawork/SmolLM2-FT-PulkitDataset • Text Generation • Updated • 32
* pulkitmehtawork/text_classification_pulkit • Text Classification • Updated • 31