Implementation:EvolvingLMMs Lab Lmms eval IFEval Instructions Util
| Knowledge Sources | |
|---|---|
| Domains | Natural_Language_Processing, Text_Processing, Model_Evaluation |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Utility functions for text processing and analysis in IFEval instruction-following evaluation.
Description
This module provides utility functions that support IFEval instruction validation: sentence splitting, word and sentence counting, keyword generation, and language code mappings. It uses NLTK for tokenization and includes a list of 1,561 common English words from which random keywords are drawn. It also maps ISO 639-1 codes to language names for 30 languages and provides specialized text processing for LaTeX-style formatting, abbreviations, and various punctuation patterns.
Usage
Use these utilities when implementing instruction checkers that need to count sentences/words, generate random keywords, split text into sentences, or work with language codes. The functions support the instruction validation logic in the IFEval framework.
Code Reference
Source Location
- Repository: EvolvingLMMs_Lab_Lmms_eval
- File: lmms_eval/tasks/ifeval/instructions_util.py
Signature
def split_into_sentences(text: str) -> list[str]:
    """Split the text into sentences."""
    ...
def count_words(text: str) -> int:
    """Counts the number of words."""
    ...
def count_sentences(text: str) -> int:
    """Count the number of sentences."""
    ...
def generate_keywords(num_keywords: int) -> list[str]:
    """Randomly generates a few keywords."""
    ...
def download_nltk_resources() -> None:
    """Download 'punkt' if not already installed"""
    ...
# Constants
WORD_LIST: list[str]  # 1,561 common English words
LANGUAGE_CODES: immutabledict  # 30 language codes to names
Import
from lmms_eval.tasks.ifeval import instructions_util
# Or import specific functions
from lmms_eval.tasks.ifeval.instructions_util import (
    split_into_sentences,
    count_words,
    count_sentences,
    generate_keywords,
    WORD_LIST,
    LANGUAGE_CODES,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| text | str | Yes | Text to process (for split_into_sentences, count_words, count_sentences) |
| num_keywords | int | Yes | Number of random keywords to generate (for generate_keywords) |
Outputs
| Name | Type | Description |
|---|---|---|
| sentences | list[str] | List of sentences (from split_into_sentences) |
| word_count | int | Number of words in text (from count_words) |
| sentence_count | int | Number of sentences in text (from count_sentences) |
| keywords | list[str] | List of randomly sampled keywords (from generate_keywords) |
Core Functions
Text Processing
split_into_sentences(text)
- Splits text into sentences with regex rules that protect periods which do not end a sentence
- Handles abbreviations (Mr., Dr., Ph.D., etc.)
- Handles acronyms (e.g., U.S.A.)
- Handles decimal numbers (e.g., 3.14)
- Handles websites (e.g., example.com)
- Handles quotation marks and punctuation
- Returns list of sentence strings
count_words(text)
- Counts words using NLTK RegexpTokenizer
- Pattern: r"\w+" (word characters)
- Returns integer count
count_sentences(text)
- Uses NLTK's punkt tokenizer
- Cached with functools.lru_cache
- Returns integer count
generate_keywords(num_keywords)
- Randomly samples from WORD_LIST
- Returns list of keyword strings
- Uses random.sample for unique selection
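As a rough illustration of the sampling behavior described for generate_keywords, the sketch below draws unique keywords with random.sample. The names SAMPLE_WORDS and sample_keywords are hypothetical stand-ins for WORD_LIST and the module's function; this is not the module's actual code.

```python
import random
from typing import Sequence

# Hypothetical stand-in for WORD_LIST; the real list is far longer.
SAMPLE_WORDS = ("mountain", "coffee", "computer", "river", "window", "garden")


def sample_keywords(num_keywords: int, words: Sequence[str] = SAMPLE_WORDS) -> list[str]:
    """Return num_keywords distinct words chosen at random."""
    # random.sample draws without replacement, so no word repeats.
    # It raises ValueError if num_keywords exceeds len(words).
    return random.sample(words, k=num_keywords)


keywords = sample_keywords(3)
print(keywords)  # three distinct words; the selection varies per run
```

Because sampling is without replacement, a request for more keywords than the list contains fails rather than returning duplicates.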
Constants
WORD_LIST
- Contains 1,561 common English words
- Used for generating random keywords
- Includes nouns, verbs, adjectives, and common words
LANGUAGE_CODES
- Immutable dictionary mapping ISO 639-1 codes to language names
- Supports 30 languages including:
  * English (en), Spanish (es), French (fr), German (de)
  * Japanese (ja), Chinese (zh), Arabic (ar), Hindi (hi)
  * And 22 additional languages
- Used by ResponseLanguageChecker
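The read-only behavior of such a mapping can be illustrated with the standard library alone. This sketch uses types.MappingProxyType as a stand-in for immutabledict (an assumption made to keep the example dependency-free) and a hypothetical four-entry subset of the codes. The helper language_name is likewise illustrative and not part of the module.

```python
from types import MappingProxyType

# Hypothetical subset of the mapping; the module defines 30 entries.
SAMPLE_LANGUAGE_CODES = MappingProxyType(
    {"en": "English", "es": "Spanish", "fr": "French", "de": "German"}
)


def language_name(code: str) -> str:
    """Look up a language name, failing loudly on unknown codes."""
    try:
        return SAMPLE_LANGUAGE_CODES[code]
    except KeyError:
        raise ValueError(f"Unsupported language code: {code!r}") from None


print(language_name("fr"))  # French

# Writes are rejected, which protects the shared constant from accidental edits.
try:
    SAMPLE_LANGUAGE_CODES["xx"] = "Unknown"
except TypeError as exc:
    print(f"read-only mapping: {exc}")
```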
Usage Examples
# Example 1: Split text into sentences
text = "Hello world. This is Dr. Smith. He works at U.S.A. The value is 3.14."
sentences = split_into_sentences(text)
# Returns a list of sentence strings; "Dr." and "3.14" are not treated as sentence ends.
# The exact split after "U.S.A." depends on the module's acronym rules.
# Example 2: Count words
text = "The quick brown fox jumps over the lazy dog."
num_words = count_words(text)
# Returns: 9
# Example 3: Count sentences
text = "First sentence. Second sentence! Third sentence?"
num_sentences = count_sentences(text)
# Returns: 3
# Example 4: Generate random keywords
keywords = generate_keywords(num_keywords=3)
# Returns: ['mountain', 'coffee', 'computer'] (example - random each time)
# Example 5: Access language codes
from lmms_eval.tasks.ifeval.instructions_util import LANGUAGE_CODES
language_name = LANGUAGE_CODES['fr']
# Returns: 'French'
all_languages = list(LANGUAGE_CODES.keys())
# Returns: ['en', 'es', 'pt', 'ar', 'hi', 'fr', ...]
# Example 6: Use in instruction checker
from lmms_eval.tasks.ifeval import instructions_util
class CustomChecker:
    def check_word_count(self, response):
        word_count = instructions_util.count_words(response)
        return word_count >= 100
    def check_sentence_count(self, response):
        sentence_count = instructions_util.count_sentences(response)
        return sentence_count >= 5
Implementation Details
Sentence Splitting Algorithm
The split_into_sentences function uses a regex-based approach that protects periods which do not end a sentence, splits on the remaining terminators, and then restores the protected periods:
1. Preprocessing - Adds spaces and replaces newlines
2. Abbreviation handling - Marks abbreviations with <prd> placeholder
3. Website handling - Protects domain extensions (.com, .org, etc.)
4. Number handling - Protects decimal points in numbers
5. Multiple dots - Handles ellipsis and multiple dots
6. Acronym handling - Protects periods in acronyms
7. Special cases - Handles Ph.D. specially
8. Sentence boundaries - Splits on periods, question marks, exclamation marks
9. Cleanup - Restores protected periods and strips whitespace
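The placeholder technique behind these steps can be sketched in a few lines. The version below is a deliberately reduced illustration covering only titles, domain extensions, and decimals. The function name simple_split and its regex rules are assumptions for this example, not the module's actual patterns.

```python
import re

PRD = "<prd>"    # placeholder for a period that does not end a sentence
STOP = "<stop>"  # marker for a sentence boundary


def simple_split(text: str) -> list[str]:
    """Split text into sentences with a reduced set of protection rules."""
    # Preprocessing: pad with spaces and flatten newlines.
    text = " " + text.replace("\n", " ") + " "
    # Protect periods after common titles (Mr., Dr., ...).
    text = re.sub(r"\b(Mr|Mrs|Ms|Dr)\.", r"\1" + PRD, text)
    # Protect periods before common domain extensions.
    text = re.sub(r"\.(com|net|org)\b", PRD + r"\1", text)
    # Protect decimal points between digits.
    text = re.sub(r"(\d)\.(\d)", r"\1" + PRD + r"\2", text)
    # Mark every remaining terminator as a sentence boundary.
    text = re.sub(r"([.?!])", r"\1" + STOP, text)
    # Cleanup: restore protected periods, split, and strip whitespace.
    text = text.replace(PRD, ".")
    return [s.strip() for s in text.split(STOP) if s.strip()]


sample = "Dr. Smith visited example.com. The price is 3.14 dollars! Is that fair?"
print(simple_split(sample))
# ['Dr. Smith visited example.com.', 'The price is 3.14 dollars!', 'Is that fair?']
```

Protecting first and splitting last keeps each rule independent: a new exception only needs one more substitution before the boundary step.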
Word Counting
Uses NLTK's RegexpTokenizer with pattern r"\w+" which:
- Matches word characters (letters, digits, underscores)
- Excludes punctuation and whitespace
- Splits contractions and hyphenated words at the apostrophe or hyphen, so "don't" and "well-known" each count as two words
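How the pattern treats contractions and hyphenated words can be checked directly. This sketch uses re.findall as a dependency-free stand-in for NLTK's RegexpTokenizer, on the assumption that both return the non-overlapping matches of the pattern. The name count_words_sketch is hypothetical.

```python
import re


def count_words_sketch(text: str) -> int:
    """Count maximal runs of word characters (letters, digits, underscores)."""
    return len(re.findall(r"\w+", text))


print(re.findall(r"\w+", "Don't over-think it."))
# ['Don', 't', 'over', 'think', 'it']
print(count_words_sketch("Don't over-think it."))  # 5
```

Checkers that enforce word limits inherit this behavior, so a response with many contractions may count slightly higher than a whitespace-based count would suggest.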
NLTK Resource Management
The module automatically downloads required NLTK resources:
- Checks for 'punkt' tokenizer availability
- Downloads on first use if not found
- Uses try-except to avoid repeated download attempts
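The check-then-download pattern can be sketched without requiring NLTK or network access. In the example below, ensure_resource and the fake lookup and download callables are hypothetical; the commented wiring shows how nltk.data.find and nltk.download might be plugged in, and is not a copy of the module's download_nltk_resources.

```python
from typing import Callable


def ensure_resource(
    name: str,
    find: Callable[[str], object],
    download: Callable[[str], object],
) -> bool:
    """Download a resource only when lookup fails.

    Returns True if a download was triggered, False if the resource
    was already available.
    """
    try:
        find(name)
        return False
    except LookupError:
        download(name)
        return True


# Fake backend so the sketch runs anywhere.
installed: set[str] = set()


def fake_find(name: str) -> None:
    if name not in installed:
        raise LookupError(name)


def fake_download(name: str) -> None:
    installed.add(name)


print(ensure_resource("punkt", fake_find, fake_download))  # True
print(ensure_resource("punkt", fake_find, fake_download))  # False

# With NLTK installed, the callables might be wired up roughly like this:
#   import nltk
#   ensure_resource(
#       "punkt",
#       find=lambda n: nltk.data.find(f"tokenizers/{n}"),
#       download=nltk.download,
#   )
```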