Recent Activity

nerdyface's activity

not-lain 
posted an update 2 days ago
JingzeShi 
posted an update 5 days ago
Tonic 
posted an update 7 days ago
🙋🏻‍♂️Hey there folks,

Did you know that you can use ModernBERT to detect model hallucinations?

Check out the demo: Tonic/hallucination-test

See here for a medical context demo: MultiTransformer/tonic-discharge-guard

Check out the model from KRLabs: KRLabsOrg/lettucedect-large-modernbert-en-v1

and the library they kindly open-sourced for it: https://github.com/KRLabsOrg/LettuceDetect

👆🏻if you like this topic please contribute code upstream 🚀

Tonic 
posted an update 9 days ago
Powered by KRLabsOrg/lettucedect-large-modernbert-en-v1 from KRLabsOrg.

Detect hallucinations in answers based on context and questions using ModernBERT with 8192-token context support!

### Model Details
- **Model Name**: [lettucedect-large-modernbert-en-v1](KRLabsOrg/lettucedect-large-modernbert-en-v1)
- **Organization**: [KRLabsOrg](https://huggingface.co/KRLabsOrg)
- **Github**: [https://github.com/KRLabsOrg/LettuceDetect](https://github.com/KRLabsOrg/LettuceDetect)
- **Architecture**: ModernBERT (Large) with extended context support up to 8192 tokens
- **Task**: Token Classification / Hallucination Detection
- **Training Dataset**: [RagTruth](wandb/RAGTruth-processed)
- **Language**: English
- **Capabilities**: Detects hallucinated spans in answers, provides confidence scores, and calculates average confidence across detected spans.

LettuceDetect excels at processing long documents to determine if an answer aligns with the provided context, making it a powerful tool for ensuring factual accuracy.
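
As a rough illustration of the "average confidence across detected spans" capability listed above, here is a minimal sketch in plain Python. It does not call LettuceDetect itself. The span layout (`start`, `end`, `text`, `confidence`), the helper names, and the example data are all assumptions made for illustration, not the library's confirmed schema.

```python
from typing import TypedDict


class Span(TypedDict):
    """Hypothetical shape of one flagged span (an assumption, not the library's schema)."""
    start: int         # character offset where the span begins in the answer
    end: int           # character offset just past the end of the span
    text: str          # the flagged substring
    confidence: float  # score in [0, 1] that the span is hallucinated


def make_span(answer: str, phrase: str, confidence: float) -> Span:
    """Build an example span by locating `phrase` inside `answer` (illustration helper)."""
    start = answer.index(phrase)
    return {"start": start, "end": start + len(phrase),
            "text": phrase, "confidence": confidence}


def average_confidence(spans: list[Span]) -> float:
    """Mean confidence across flagged spans; 0.0 when nothing was flagged."""
    if not spans:
        return 0.0
    return sum(s["confidence"] for s in spans) / len(spans)


def flagged_text(answer: str, spans: list[Span]) -> list[str]:
    """Slice the flagged substrings out of the answer using the character offsets."""
    return [answer[s["start"]:s["end"]] for s in spans]


# Made-up example data; the confidence values are not real model output.
answer = "The capital of France is Paris. The population of France is 69 million."
spans = [make_span(answer, "Paris", 0.5), make_span(answer, "69 million", 1.0)]

print(flagged_text(answer, spans))
print(average_confidence(spans))
```

Working from character offsets rather than the stored `text` field keeps the flagged spans easy to highlight in the original answer.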
stefan-it 
posted an update 11 days ago
🇹🇷 😍 I'm very happy to finally announce my new Turkish LM called "BERT5urk":

stefan-it/bert5urk

It is a 1.42B T5-based model, trained with the UL2 pretraining objective on the Turkish part of the awesome HuggingFaceFW/fineweb-2 dataset.

Feel free to check it out!
stefan-it 
posted an update 15 days ago
After running some 3DMark and FurMark benchmarks on Windows to make sure that my new 5090 is not causing melting cables [1], and taking some nice shots with a thermal camera (I don't think that's too much), running fine-tuning experiments with my favorite Flair & Transformers libraries is very easy.

Important steps:

A good idea is to start with a fresh Ubuntu 24.04 installation with the latest CUDA 12.8 and the open NVIDIA driver; see [2] for more advice:

sudo apt -y install cuda-toolkit-12-8 nvidia-open

I tried updating from an existing Ubuntu installation with an older CUDA and driver version, and it resulted in an unbootable system.

If you are using PyTorch 2.6 built with CUDA 12.6, it will result in:

NVIDIA Graphics Device with CUDA capability sm_120 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_50 sm_60 sm_70 sm_75 sm_80 sm_86 sm_90.

But no worries! For PyTorch, you just need to use a nightly 2.7 version that was built with CUDA 12.8. This can easily be done via:

pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128

After that the latest Flair version can be installed and fine-tuning will work!
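
To make the error above concrete, here is a minimal sketch that checks whether a GPU's compute capability appears in the architecture list a PyTorch build reports. It is plain Python and deliberately simplified: it only does the exact-name lookup that the error message implies, and the helper names are made up for illustration. On a real machine, the inputs could come from `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`.

```python
import re


def parse_arch_list(message: str) -> set[str]:
    """Extract architecture names such as 'sm_90' from an error message or arch list."""
    return set(re.findall(r"sm_\d+", message))


def capability_to_arch(major: int, minor: int) -> str:
    """Format a compute capability as an arch name, e.g. (12, 0) becomes 'sm_120'."""
    return f"sm_{major}{minor}"


def is_listed(capability: tuple[int, int], supported: set[str]) -> bool:
    """Simplified check: is this capability named in the build's arch list?"""
    return capability_to_arch(*capability) in supported


# The second line of the error message quoted above.
supported_line = (
    "The current PyTorch install supports CUDA capabilities "
    "sm_50 sm_60 sm_70 sm_75 sm_80 sm_86 sm_90."
)
supported = parse_arch_list(supported_line)

# The 5090 reports sm_120, which the CUDA 12.6 build does not list.
print(is_listed((12, 0), supported))
# An sm_90 device is covered by the same build.
print(is_listed((9, 0), supported))
```

Running the same check against the arch list of the cu128 nightly build is a quick way to confirm that the new wheel actually targets the card.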

References:

[1]: https://www.reddit.com/r/nvidia/comments/1inpox7/rtx_50_series_12vhpwr_megathread/
[2]: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=24.04&target_type=deb_network
stefan-it 
posted an update 18 days ago
She arrived 😍

[Expect more models soon...]
JingzeShi 
posted an update 21 days ago
fuzzy-mittenz 
posted an update 30 days ago
So frustrated with "Reasoning" Models.
Sure, introducing RAG into the mix, or giving it an interpreter to do math with, helps, but never as much as a model that has good instructions.

Even if it's just to repeat the information before answering, a normal model will usually out-"think" its reasoning counterpart.

Not sure if it's my frustrations but the best answers I've received (from a reasoner), so far, are from the simple instructions to, "Do better!"

Figured I would share the special sauce.

Using 10-100x compute just to heat the office can't be environmentally friendly, and it still has no idea where my keys are.
Tonic 
posted an update about 1 month ago
🙋🏻‍♂️hey there folks ,

Goedel's Theorem Prover is now being demoed on Hugging Face: Tonic/Math

give it a try !
fuzzy-mittenz 
posted an update about 1 month ago
Our extremely efficient and functional importance-matrix distillation of the new Qwen2.5-1M model is very capable in many areas, so we are hoping to use it to research our small-AGI character creation process, which has shown emergent traits and increased functionality in constrained environments.
The method creates an RP-type interaction in a highly useful, tool-functional environment.
We have a basic method and are working on retrieving data for a full analysis and perfection of it. The method exploits human language input to express often abstract traits into a model, employ characteristics of healthy human reasoning processes, and identify novel ways of increasing the overall functionality of a model through traits. The traits observed so far are whistling, bouncing a ball, and repeating certain engagements.
Adding the semblance of human world interactions is so far the best way of creating a human-like LLM.
We have attached the paper to the model we are testing this with, along with examples. If you wish to use it with other models, please be cautious and enjoy yourself. Above all, please keep track of conversations and settings and submit them to the Intelligent Estate email; you will receive a recognition letter and ledger number for your contribution to the project.
Model= Israfel and Thoth IntelligentEstate/Israfel_Qwen2.6-iQ4_K_M-GGUF
JingzeShi 
posted an update about 1 month ago
Welcome to the Doge Face Open Source Community! 🚀
Our goal for the next two years is to explore the indispensable foundation of embodied intelligence: small language models. 🔬
We aim to open-source code and documentation to give everyone more time to slack off while working or studying! 🤗
👉 Repository name on Github: https://github.com/SmallDoges/small-doge
👉 Organization name on Hugging Face: https://huggingface.co/SmallDoge
fuzzy-mittenz 
posted an update about 1 month ago
Not many seemed to notice, but what was probably meant to be a win for artists' rights at the US Copyright Office has solved some fundamental issues for the community.
In our recent article, I outline how companies like Suno, OpenAI, Midjourney, etc. can no longer claim any right to copy the work that you create with their platforms.
We also look at other ways this study and the new rules for AI will fundamentally affect creators who use it, and how companies' incentives to give them control over certain aspects might change because of this. It's broken down pretty well here: https://huggingface.co/blog/fuzzy-mittenz/copyright-in-ai
not-lain 
posted an update about 1 month ago
Tonic 
posted an update about 2 months ago
🙋🏻‍♂️ Hey there folks ,

our team made a game during the @mistral-game-jam and we're trying to win the community award !

try our game out and drop us a ❤️ like basically to vote for us !

Mistral-AI-Game-Jam/TextToSurvive

hope you like it !
fuzzy-mittenz 
posted an update about 2 months ago
For those of you who wanted a Replicant of your own with more power, here is a higher-functioning little [operator](IntelligentEstate/Replicant_Operator_ed-Qw25-Q8_0-GGUF) for all your GGUF tool-use needs. Included is a paper on emergent behaviors and LC (limit crossing) for the creation of small AGI. Please index traits and newfound breakthroughs using this method, and be careful with tool use and emotional attachment.
JingzeShi 
posted an update about 2 months ago
JingzeShi 
posted an update about 2 months ago
not-lain 
posted an update about 2 months ago
we now have more than 2000 public AI models using ModelHubMixin🤗