consolidated.safetensors #9916

Open

CrispStrobe wants to merge 1 commit into master
Conversation

CrispStrobe
Contributor

Easier handling (as e.g. for Ministral)

@github-actions bot added the python (python script changes) label Oct 16, 2024
Comment on lines 452 to 456
     for filename in os.listdir(dir_model):
-        if filename.startswith(prefix) and filename.endswith(suffix):
+        if any(filename.startswith(prefix) for prefix in prefixes) and any(filename.endswith(suffix) for suffix in suffixes):
             part_names.append(filename)
+        elif filename == "consolidated.safetensors":
+            part_names.append(filename)
Collaborator


What if there are both model*.safetensors files and consolidated.safetensors in the same directory?

For example, https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1/ (which needs #9126) has both consolidated.safetensors and model-0000?-of-00003.safetensors.

Since git config --local lfs.fetchinclude <some_pattern> can be used to selectively download model files, I'm not sure how to handle that case if consolidated.safetensors is detected. I think the convert script should not use both at once (since duplicated tensor names are problematic), but how to choose?

What do you think?
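For illustration, a minimal sketch (not part of the PR) of the duplication being discussed: it reads only the safetensors file headers and compares the tensor names exposed by consolidated.safetensors with those of the split model-*.safetensors files. The directory name is a hypothetical local checkout of the Mamba-Codestral repo mentioned above.

```python
# Minimal sketch, not from the PR: compare tensor names across the two layouts.
# Only the headers are parsed (8-byte little-endian length + JSON header, per the
# safetensors format), so no tensor data is loaded. Paths are hypothetical.
import json
import struct
from pathlib import Path

def tensor_names(path: Path) -> set[str]:
    with path.open("rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {name for name in header if name != "__metadata__"}

dir_model = Path("Mamba-Codestral-7B-v0.1")  # assumed local checkout
consolidated = tensor_names(dir_model / "consolidated.safetensors")
split: set[str] = set()
for part in sorted(dir_model.glob("model-*.safetensors")):
    split |= tensor_names(part)

print(f"consolidated: {len(consolidated)} tensors, "
      f"split: {len(split)} tensors, "
      f"shared names: {len(consolidated & split)}")
```

Any shared names would be seen twice by a converter that reads both file sets, which is why the script should commit to one layout.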

Contributor Author

@CrispStrobe Oct 17, 2024


Indeed. Something like the following, then?

def get_model_part_names(dir_model: Path, prefixes: list[str], suffixes: list[str]) -> list[str]:
    """
    Retrieves the list of model part filenames from the model directory.
    Prioritizes 'model-XXXX-of-XXXX.safetensors' files over 'consolidated.safetensors'.

    Parameters:
    - dir_model (Path): Path to the model directory.
    - prefixes (list[str]): List of filename prefixes to match.
    - suffixes (list[str]): List of filename suffixes to match.

    Returns:
    - list[str]: Sorted list of model part filenames.
    """
    part_names: list[str] = []

    # Collect files matching the given prefixes and suffixes
    for filename in os.listdir(dir_model):
        if any(filename.startswith(prefix) for prefix in prefixes) and any(filename.endswith(suffix) for suffix in suffixes):
            part_names.append(filename)
        elif filename == "consolidated.safetensors":
            part_names.append(filename)

    # Sort the list for consistency
    part_names.sort()

    # Check if both split files and 'consolidated.safetensors' are present
    split_files = [f for f in part_names if f.startswith("model-") and f.endswith(".safetensors")]
    consolidated_present = "consolidated.safetensors" in part_names

    if split_files and consolidated_present:
        logger.debug("Both split model files and 'consolidated.safetensors' found. Ignoring 'consolidated.safetensors'.")
        # Remove 'consolidated.safetensors' from part_names
        part_names = [f for f in part_names if f != "consolidated.safetensors"]

    # Final sort after potential removal
    part_names.sort()

    if not part_names:
        logger.warning("No model weight files found in the directory.")

    return part_names
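For reference, a usage sketch of the proposed helper. It assumes the function sits in convert_hf_to_gguf.py, where os, Path, and logger are already available; the imports and logger setup below only make the snippet self-contained, and the model directory and argument values are illustrative.

```python
# Usage sketch: these imports and the logger mirror what convert_hf_to_gguf.py
# already provides at module top level (i.e. before the function definition);
# the directory name is hypothetical and the prefix/suffix lists follow the
# converter's existing "model" / ".safetensors" convention.
import logging
import os
from pathlib import Path

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

part_names = get_model_part_names(Path("Mamba-Codestral-7B-v0.1"), ["model"], [".safetensors"])
# With both layouts on disk, the split parts are kept and consolidated.safetensors
# is dropped, e.g. ['model-00001-of-00003.safetensors', ..., 'model-00003-of-00003.safetensors']
print(part_names)
```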
