Instructions for using NYTK/PULI-LlumiX-32K with libraries and local apps.

Libraries

How to use NYTK/PULI-LlumiX-32K with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NYTK/PULI-LlumiX-32K", trust_remote_code=True)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NYTK/PULI-LlumiX-32K", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("NYTK/PULI-LlumiX-32K", trust_remote_code=True)

Local Apps

vLLM

How to use NYTK/PULI-LlumiX-32K with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "NYTK/PULI-LlumiX-32K"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NYTK/PULI-LlumiX-32K",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
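
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `localhost:8000`. The helper names `build_completion_request` and `send` are illustrative and not part of vLLM. The block only builds the request; `send` needs a running server.

```python
import json
import urllib.request

MODEL_ID = "NYTK/PULI-LlumiX-32K"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the first generated text (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Build the request and show where it would be sent; nothing goes over the network here.
request = build_completion_request("Once upon a time,")
print(request.full_url)
```

With the server running, `send(request)` should return the generated continuation. Passing `base_url="http://localhost:30000"` targets a server on port 30000 instead.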

Use Docker

docker model run hf.co/NYTK/PULI-LlumiX-32K

SGLang

How to use NYTK/PULI-LlumiX-32K with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "NYTK/PULI-LlumiX-32K" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NYTK/PULI-LlumiX-32K",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "NYTK/PULI-LlumiX-32K" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NYTK/PULI-LlumiX-32K",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use NYTK/PULI-LlumiX-32K with Docker Model Runner:
```
docker model run hf.co/NYTK/PULI-LlumiX-32K
```

PULI LlumiX 32K base (6.74 billion parameters)

For further details, or to test our instruct model, see our demo site.

Trained with OpenChatKit (GitHub)
The LLaMA-2-7B-32K model was continually pretrained on a Hungarian dataset
The model has been extended to a context length of 32K with position interpolation
Checkpoint: 100 000 steps
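
Position interpolation rescales position indices so that the extended window maps back into the range seen during the original pretraining, rather than extrapolating beyond it. The following is a minimal sketch of the idea for rotary embeddings. It assumes the original LLaMA-2 window of 4,096 tokens (giving a scale of 4096/32768 = 1/8), a rotary base of 10,000, and a head dimension of 128. These values and the function name are illustrative assumptions and are not read from this model's configuration.

```python
ORIGINAL_CONTEXT = 4096    # assumed pretraining window of LLaMA-2
EXTENDED_CONTEXT = 32768   # context length stated in this model card
SCALE = ORIGINAL_CONTEXT / EXTENDED_CONTEXT  # 1/8


def rotary_angles(position, head_dim=128, base=10000.0, scale=1.0):
    """Return the rotary angles for one position; scale < 1 applies interpolation."""
    scaled_position = position * scale
    return [
        scaled_position * base ** (-2 * i / head_dim)
        for i in range(head_dim // 2)
    ]


# The last position of the extended window lands inside the original range.
last_position = EXTENDED_CONTEXT - 1
print(last_position * SCALE)

# With interpolation, position 8 receives the angles that position 1 had originally.
print(rotary_angles(8, scale=SCALE) == rotary_angles(1))
```

Under these assumptions, neighbouring tokens end up closer together in angle space, which is why models extended this way are usually trained further on long documents.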

Dataset for continued pretraining

Hungarian: 7.9 billion words from 763K documents, each exceeding 5,000 words in length
English: Long Context QA (1 billion words), BookSum (42 million words)

Limitations

max_seq_length = 32 768
float16
vocab size: 32 000
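
For capacity planning, float16 weights need two bytes per parameter. The following back-of-the-envelope sketch uses the 6.74 billion parameter count given in this card. It covers the weights only and ignores the KV cache, activations, and framework overhead, all of which grow with sequence length.

```python
PARAMETERS = 6.74e9        # parameter count from this model card
BYTES_PER_PARAMETER = 2    # float16 stores each value in 2 bytes


def weight_memory_gib(parameters=PARAMETERS, bytes_per_parameter=BYTES_PER_PARAMETER):
    """Estimate the memory needed for the weights alone, in GiB."""
    return parameters * bytes_per_parameter / 2**30


print(f"{weight_memory_gib():.1f} GiB")
```

This comes to about 12.6 GiB for the weights, so real usage at long context lengths will be noticeably higher.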

Usage with pipeline

from transformers import pipeline, LlamaForCausalLM, LlamaTokenizer

# Load the model weights and tokenizer from the Hugging Face Hub
model = LlamaForCausalLM.from_pretrained("NYTK/PULI-LlumiX-32K")
tokenizer = LlamaTokenizer.from_pretrained("NYTK/PULI-LlumiX-32K")
prompt = "Elmesélek egy történetet a nyelvtechnológiáról."  # "I will tell a story about language technology."
generator = pipeline(task="text-generation", model=model, tokenizer=tokenizer)

# Generate up to 30 new tokens and print the prompt together with its continuation
print(generator(prompt, max_new_tokens=30)[0]["generated_text"])
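
Because the prompt and the generated tokens share the same 32,768-token window, it can help to check the budget before calling the generator. The following is a small sketch. `clamp_new_tokens` is a hypothetical helper, not part of Transformers, and the prompt length would typically come from something like `len(tokenizer(prompt)["input_ids"])`.

```python
MAX_SEQ_LENGTH = 32768  # limit stated in this model card


def clamp_new_tokens(prompt_tokens, requested_new_tokens, max_seq_length=MAX_SEQ_LENGTH):
    """Return how many new tokens fit after the prompt within the context window."""
    if prompt_tokens >= max_seq_length:
        raise ValueError("prompt alone fills or exceeds the context window")
    return min(requested_new_tokens, max_seq_length - prompt_tokens)


# A short prompt leaves the requested budget untouched.
print(clamp_new_tokens(prompt_tokens=20, requested_new_tokens=30))

# A prompt near the limit leaves room for only a few new tokens.
print(clamp_new_tokens(prompt_tokens=32760, requested_new_tokens=30))
```

The returned value can then be passed as `max_new_tokens` to the generator.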

Citation

If you use this model, please cite the following paper:

@inproceedings {yang-llumix,
    title = {The First Instruct-Following Large Language Models for Hungarian},
    booktitle = {2024 IEEE 3rd Conference on Information Technology and Data Science (CITDS) Proceedings},
    year = {2024},
    publisher = {University of Debrecen},
    address = {Debrecen, Hungary},
    author = {Zijian {\relax Gy}őző Yang and Réka Dodé and Gerg\H{o} Ferenczi and Péter Hatvani and Enik\H{o} Héja and Gábor Madarász and Noémi Ligeti-Nagy and Bence Sárossy and {\relax Zs}ófia Szaniszló and Tamás Váradi and Tamás Verebélyi and Gábor Prószéky},
    pages = {247--252},
    isbn = {9798350387889}
}
