SNUH-HARI/DeepSeek-llama3.1-HARI-8B
Model Description
SNUH-HARI/DeepSeek-llama3.1-HARI-8B is a fine-tuned version of DeepSeek-llama3.1-Bllossom-8B with 8 billion parameters, optimized for healthcare applications. Developed by the Healthcare AI Research Institute (HARI) at Seoul National University Hospital (SNUH), this model combines open medical datasets (including synthesized data) with pseudonymized clinical notes to enhance patient safety and responsible AI in medicine.
- Architecture: Transformer-based large language model (LLM)
- Languages: English, Korean
- Primary Domains: Healthcare, General NLP
- Use Cases: Medical question answering, clinical decision support, patient safety applications
Training Details
Base Model: DeepSeek-llama3.1-Bllossom-8B (built on deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
Fine-Tuning Datasets:
- SNUH pseudonymized clinical notes for real-world medical knowledge
- MedicalLawQA (curated from Korea Legislation Research Institute data using GPT-4o-mini)
- Medical reasoning dataset from FreedomIntelligence/medical-o1-reasoning-SFT
Optimization: Mixed precision (FP16) for efficiency
Compute Resources: High-performance GPUs (e.g., NVIDIA H100 clusters)
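The exact training recipe is not published in this card. The sketch below shows how a comparable FP16 supervised fine-tuning run could be set up with Hugging Face Transformers using only the public FreedomIntelligence/medical-o1-reasoning-SFT dataset (the pseudonymized SNUH notes are not publicly available). The base-model repo id, the "en" configuration, the Question/Response field names, the prompt template, and all hyperparameters are illustrative assumptions, not the values used for this model:
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
# Assumed Hugging Face repo id for the Bllossom base model
base_model = "UNIVA-Bllossom/DeepSeek-llama3.1-Bllossom-8B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)
# Public reasoning dataset listed above; the "en" config is an assumption
raw = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train")
def to_text(example):
    # Illustrative prompt template: question followed by the reference response
    return {"text": f"Question: {example['Question']}\nAnswer: {example['Response']}"}
def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=2048)
dataset = raw.map(to_text).map(tokenize, remove_columns=raw.column_names + ["text"])
args = TrainingArguments(
    output_dir="hari-8b-sft",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=1,
    learning_rate=2e-5,
    fp16=True,  # mixed-precision training, as noted above
    logging_steps=50,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()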
Intended Use
This model is designed for research, healthcare AI, and legal AI applications. It is particularly suitable for:
- Medical question answering
- Clinical decision-making support
- Healthcare policy and compliance
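For instance, a clinical decision-support style query can be sent through the standard text-generation pipeline. The patient vignette and generation settings below are illustrative only, and any output must be reviewed by a clinician, as noted in the Limitations section below:
from transformers import pipeline
# Illustrative clinical decision-support prompt; not a validated clinical workflow
generator = pipeline("text-generation", model="SNUH-HARI/DeepSeek-llama3.1-HARI-8B")
vignette = (
    "A 67-year-old man on warfarin presents with an INR of 6.5 and no active bleeding. "
    "List the recommended next steps and the points a clinician should verify."
)
result = generator(vignette, max_new_tokens=256, do_sample=False)
print(result[0]["generated_text"])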
Limitations & Ethical Considerations
- Not a replacement for medical professionals: Outputs should be validated by experts.
- Potential biases: Legal and medical knowledge are jurisdiction-specific; users should verify regional applicability.
- Privacy compliance: No personally identifiable information was used in training.
Evaluation & Benchmarks
This model was evaluated using 100 medical law-related QA pairs from the KMLE (Korean Medical Licensing Exam) 2019–2023 dataset.
| Model | Accuracy (%) |
|---|---|
| DeepSeek-llama3.1-Bllossom-8B | 34 |
| DeepSeek-llama3.1-HARI-8B (ours) | TBD |
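The evaluation pipeline itself is not released with this card. The following is a minimal sketch of how accuracy on multiple-choice KMLE-style items could be computed; the local file name kmle_law_qa.jsonl, its question/options/answer fields, and the answer-letter extraction rule are hypothetical, not part of the released benchmark:
import json
import re
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "SNUH-HARI/DeepSeek-llama3.1-HARI-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
def ask(question, options):
    # Format one multiple-choice item; the prompt wording is an assumption
    prompt = question + "\n" + "\n".join(options) + "\nAnswer with the option letter only."
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=16)
    reply = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    match = re.search(r"[A-E]", reply)
    return match.group(0) if match else None
# Hypothetical local file with 100 KMLE medical-law items (not distributed here)
items = [json.loads(line) for line in open("kmle_law_qa.jsonl", encoding="utf-8")]
correct = sum(ask(item["question"], item["options"]) == item["answer"] for item in items)
print(f"Accuracy: {100 * correct / len(items):.1f}%")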
How to Use
You can use the model via Hugging Face Transformers:
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the tokenizer and model weights from the Hugging Face Hub
model_name = "SNUH-HARI/DeepSeek-llama3.1-HARI-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Tokenize a prompt and generate a response (max_length counts prompt plus new tokens)
input_text = "What are the legal requirements for prescribing narcotics in South Korea?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
output = model.generate(input_ids, max_length=1024)
print(tokenizer.decode(output[0], skip_special_tokens=True))
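Because the DeepSeek-R1-Distill-Llama family is distributed with a chat template, wrapping the prompt with the tokenizer's chat template may produce cleaner responses. The snippet below continues the example above; whether this fine-tune expects the same chat format is an assumption to verify against the released tokenizer configuration:
# Optional chat-style prompting (assumes the tokenizer provides a chat template)
messages = [{"role": "user", "content": input_text}]
chat_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
chat_output = model.generate(chat_ids, max_new_tokens=512)
print(tokenizer.decode(chat_output[0][chat_ids.shape[1]:], skip_special_tokens=True))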
License
This model is released under the MIT License.
Citation
If you use this model in your research, please cite:
@misc{SNUH-HARI-DeepSeek-llama3.1-HARI-8B,
title={SNUH-HARI/DeepSeek-llama3.1-HARI-8B},
author={Hyeonhoon Lee ([email protected])},
year={2025},
publisher={Hugging Face},
url={https://huggingface.co./SNUH-HARI/DeepSeek-llama3.1-HARI-8B}
}